This webpage presents qualitative results for our submission:
FlexLip: A controllable text-to-lip system.
Dan Oneață, Beáta Lőrincz, Adriana Stan, Horia Cucu.
Submitted to Sensors, 2022.
The following samples are generated with the proposed text-to-speech component. They correspond to the following three sentences:
System id | Sample 1 | Sample 2 | Sample 3 |
---|---|---|---|
Natural | | | |
O-8 | | | |
O-8-LJ | | | |
O-8-LT | | | |
O-8-LJ-dvb | | | |
O-8-LT-dvb | | | |
O-1-LJ | | | |
O-1-LT | | | |
O-1-LJ-dvb | | | |
O-1-LT-dvb | | | |
O-0.3-LJ | | | |
O-0.3-LT | | | |
O-0.3-LJ-dvb | | | |
O-0.3-LT-dvb | | | |
Here we present results for the full pipeline, which goes from text to keypoints. The results correspond to Section 5.3 of the paper.
Below we show results obtained by applying the model pretrained on Obama to audio data collected from Trump. We present two cases:
xrPZBTNjX_o-000-023
xrPZBTNjX_o-000-034
xrPZBTNjX_o-000-039
xrPZBTNjX_o-000-045