Details
Original language | English |
---|---|
Pages | 571-574 |
Number of pages | 4 |
Publication status | Published - 2000 |
Externally published | Yes |
Event | 2000 IEEE International Conference on Multimedia and Expo (ICME 2000) - New York, NY, United States. Duration: 30 Jul 2000 → 2 Aug 2000 |
Conference
Conference | 2000 IEEE International Conference on Multimedia and Expo (ICME 2000) |
---|---|
Country/Territory | United States |
City | New York, NY |
Period | 30 Jul 2000 → 2 Aug 2000 |
Abstract
Multimodal Speech Synthesis ("Talking Heads") encompasses synthesis of speech from text ("Text-to-Speech", TTS) plus synthesis of a visual presentation of a face that is lip-synced to the generated audio ("Visual TTS", VTTS). Talking Heads are now practical because of ever-increasing computing power and falling prices of computer hardware. This paper highlights recent technological breakthroughs relevant to the two modalities. In addition, it exposes synergies between the audio and visual technology components. Finally, the paper summarizes test results that highlight the impact of Multimodal Speech Synthesis in communications and e-commerce applications.
ASJC Scopus subject areas
- Engineering(all)
- General Engineering
Cite this
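Schroeter, J., Ostermann, J., Graf, H. P., Beutnagel, M., Cosatto, E., Syrdal, A., Conkie, A., & Stylianou, Y. (2000). Multimodal Speech Synthesis. 571-574. Paper presented at 2000 IEEE International Conference on Multimedia and Expo (ICME 2000), New York, NY, United States.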
Research output: Contribution to conference › Paper › Research › peer review
TY - CONF
T1 - Multimodal Speech Synthesis
AU - Schroeter, J.
AU - Ostermann, J.
AU - Graf, H. P.
AU - Beutnagel, M.
AU - Cosatto, E.
AU - Syrdal, A.
AU - Conkie, A.
AU - Stylianou, Y.
PY - 2000
Y1 - 2000
N2 - Multimodal Speech Synthesis ("Talking Heads") encompasses synthesis of speech from text ("Text-to-Speech", TTS) plus synthesis of a visual presentation of a face that is lip-synced to the generated audio ("Visual TTS", VTTS). Talking Heads are now practical because of ever-increasing computing power and falling prices of computer hardware. This paper highlights recent technological breakthroughs relevant to the two modalities. In addition, it exposes synergies between the audio and visual technology components. Finally, the paper summarizes test results that highlight the impact of Multimodal Speech Synthesis in communications and e-commerce applications.
UR - http://www.scopus.com/inward/record.url?scp=0034509487&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:0034509487
SP - 571
EP - 574
T2 - 2000 IEEE International Conference on Multimedia and Expo (ICME 2000)
Y2 - 30 July 2000 through 2 August 2000
ER -