Details
Original language | English |
---|---|
Pages (from-to) | 1406-1428 |
Number of pages | 23 |
Journal | Proceedings of the IEEE |
Volume | 91 |
Issue number | 9 |
Publication status | Published - Sep 2003 |
Externally published | Yes |
Abstract
Lifelike talking faces for interactive services are an exciting new modality for man-machine interactions. Recent developments in speech synthesis and computer animation enable the real-time synthesis of faces that look and behave like real people, opening opportunities to make interactions with computers more like face-to-face conversations. This paper focuses on the technologies for creating lifelike talking heads, illustrating the two main approaches: model-based animations and sample-based animations. The traditional model-based approach uses three-dimensional wire-frame models, which can be animated from high-level parameters such as muscle actions, lip postures, and facial expressions. The sample-based approach, on the other hand, concatenates segments of recorded videos, instead of trying to model the dynamics of the animations in detail. Recent advances in image analysis enable the creation of large databases of mouth and eye images, suited for sample-based animations. The sample-based approach tends to generate more natural-looking animations at the expense of a larger size and less flexibility than the model-based animations. Besides lip articulation, a talking head must show appropriate head movements, in order to appear natural. We illustrate how such "visual prosody" is analyzed and added to the animations. Finally, we present four applications where the use of face animation in interactive services results in engaging user interfaces and an increased level of trust between user and machine. Using an RTP-based protocol, face animation can be driven with only 800 bits/s in addition to the rate for transmitting audio.
ASJC Scopus subject areas
- Computer Science (all)
- General Computer Science
- Engineering (all)
- Electrical and Electronic Engineering
In: Proceedings of the IEEE, Vol. 91, No. 9, 09.2003, pp. 1406-1428.
Publication: Contribution to journal › Article › Research › Peer-reviewed
TY - JOUR
T1 - Lifelike talking faces for interactive services
AU - Cosatto, Eric
AU - Ostermann, Jörn
AU - Graf, Hans Peter
AU - Schroeter, Juergen
PY - 2003/9
Y1 - 2003/9
AB - Lifelike talking faces for interactive services are an exciting new modality for man-machine interactions. Recent developments in speech synthesis and computer animation enable the real-time synthesis of faces that look and behave like real people, opening opportunities to make interactions with computers more like face-to-face conversations. This paper focuses on the technologies for creating lifelike talking heads, illustrating the two main approaches: model-based animations and sample-based animations. The traditional model-based approach uses three-dimensional wire-frame models, which can be animated from high-level parameters such as muscle actions, lip postures, and facial expressions. The sample-based approach, on the other hand, concatenates segments of recorded videos, instead of trying to model the dynamics of the animations in detail. Recent advances in image analysis enable the creation of large databases of mouth and eye images, suited for sample-based animations. The sample-based approach tends to generate more natural-looking animations at the expense of a larger size and less flexibility than the model-based animations. Besides lip articulation, a talking head must show appropriate head movements, in order to appear natural. We illustrate how such "visual prosody" is analyzed and added to the animations. Finally, we present four applications where the use of face animation in interactive services results in engaging user interfaces and an increased level of trust between user and machine. Using an RTP-based protocol, face animation can be driven with only 800 bits/s in addition to the rate for transmitting audio.
KW - Avatar
KW - Computer graphics
KW - Face animation
KW - MPEG-4
KW - Sample-based graphics
KW - Speech synthesizer
KW - Text-to-speech (TTS)
KW - Video-based rendering
KW - Visual text-to-speech (VTTS)
UR - http://www.scopus.com/inward/record.url?scp=10044281988&partnerID=8YFLogxK
U2 - 10.1109/JPROC.2003.817141
DO - 10.1109/JPROC.2003.817141
M3 - Article
AN - SCOPUS:10044281988
VL - 91
SP - 1406
EP - 1428
JO - Proceedings of the IEEE
JF - Proceedings of the IEEE
SN - 0018-9219
IS - 9
ER -