Realistic facial expression synthesis for an image-based talking head

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review


Details

Original language: English
Title of host publication: Electronic Proceedings of the 2011 IEEE International Conference on Multimedia and Expo
Subtitle of host publication: ICME 2011
Publication status: Published - September 2011
Event: 2011 12th IEEE International Conference on Multimedia and Expo, ICME 2011 - Barcelona, Spain
Duration: 11 Jul 2011 - 15 Jul 2011

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Abstract

This paper presents an image-based talking head system that is able to synthesize realistic facial expressions accompanying speech, given arbitrary text input and control tags of facial expression. The smile is used as an example of a facial expression primitive. First, three types of videos are recorded: a performer speaking without any expression, smiling while speaking, and smiling after speaking. By analyzing the recorded audiovisual data, an expressive database is built that contains normalized neutral and smiling mouth images, as well as their associated features and expressive labels. The expressive talking head is synthesized by a unit selection algorithm, which selects and concatenates appropriate mouth image segments from the expressive database. Experimental results show that the smiles of the talking heads are objectively as realistic as real smiles, and viewers cannot distinguish the real smiles from the synthesized ones.
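The unit-selection step described above can be illustrated with a minimal, hypothetical Python sketch. It assumes a per-slot list of candidate database segments, a simple label-mismatch target cost, and a feature-distance concatenation cost; all names and cost functions below are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of unit selection: pick one candidate mouth-image
# segment per target slot so that the sum of target cost (match to the
# requested expressive label) and concatenation cost (visual smoothness
# between adjacent segments) is minimal, via a Viterbi-style search.

from dataclasses import dataclass

@dataclass
class Segment:
    features: float   # stand-in for the segment's visual feature vector
    label: str        # expressive label, e.g. "neutral" or "smile"

def target_cost(seg: Segment, wanted_label: str) -> float:
    # Penalize segments whose expressive label differs from the request.
    return 0.0 if seg.label == wanted_label else 1.0

def concat_cost(a: Segment, b: Segment) -> float:
    # Penalize visually dissimilar adjacent segments (feature distance).
    return abs(a.features - b.features)

def select_units(candidates: list[list[Segment]], labels: list[str]) -> list[int]:
    """Return the index of the chosen candidate for each slot."""
    n = len(candidates)
    cost = [[target_cost(s, labels[0]) for s in candidates[0]]]
    back = []
    for t in range(1, n):
        row, ptr = [], []
        for s in candidates[t]:
            prev = candidates[t - 1]
            best = min(range(len(prev)),
                       key=lambda j: cost[t - 1][j] + concat_cost(prev[j], s))
            row.append(cost[t - 1][best] + concat_cost(prev[best], s)
                       + target_cost(s, labels[t]))
            ptr.append(best)
        cost.append(row)
        back.append(ptr)
    # Trace back the cheapest path through the candidate lattice.
    path = [min(range(len(cost[-1])), key=cost[-1].__getitem__)]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]
```

In this sketch the selected segments would then be concatenated (with blending) to render the final mouth sequence; the paper's actual cost functions and features are not specified in this record.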

Keywords

    facial expression, image-based animation, Talking head, unit selection


Cite this

Realistic facial expression synthesis for an image-based talking head. / Liu, Kang; Ostermann, Joern.
Electronic Proceedings of the 2011 IEEE International Conference on Multimedia and Expo: ICME 2011. 2011. 6011835 (Proceedings - IEEE International Conference on Multimedia and Expo).


Liu, K & Ostermann, J 2011, Realistic facial expression synthesis for an image-based talking head. in Electronic Proceedings of the 2011 IEEE International Conference on Multimedia and Expo: ICME 2011., 6011835, Proceedings - IEEE International Conference on Multimedia and Expo, 2011 12th IEEE International Conference on Multimedia and Expo, ICME 2011, Barcelona, Spain, 11 Jul 2011. https://doi.org/10.1109/ICME.2011.6011835
Liu, K., & Ostermann, J. (2011). Realistic facial expression synthesis for an image-based talking head. In Electronic Proceedings of the 2011 IEEE International Conference on Multimedia and Expo: ICME 2011. Article 6011835 (Proceedings - IEEE International Conference on Multimedia and Expo). https://doi.org/10.1109/ICME.2011.6011835
Liu K, Ostermann J. Realistic facial expression synthesis for an image-based talking head. In Electronic Proceedings of the 2011 IEEE International Conference on Multimedia and Expo: ICME 2011. 2011. 6011835. (Proceedings - IEEE International Conference on Multimedia and Expo). doi: 10.1109/ICME.2011.6011835
Liu, Kang ; Ostermann, Joern. / Realistic facial expression synthesis for an image-based talking head. Electronic Proceedings of the 2011 IEEE International Conference on Multimedia and Expo: ICME 2011. 2011. (Proceedings - IEEE International Conference on Multimedia and Expo).
@inproceedings{ec540e0aeb6d4e0bba6bba1a9d6e2f95,
title = "Realistic facial expression synthesis for an image-based talking head",
abstract = "This paper presents an image-based talking head system that is able to synthesize realistic facial expressions accompanying speech, given arbitrary text input and control tags of facial expression. As an example of facial expression primitives, smile is used. First, three types of videos are recorded: a performer speaking without any expressions, smiling while speaking, and smiling after speaking. By analyzing the recorded audiovisual data, an expressive database is built and contains normalized neutral mouth images and smiling mouth images, as well as their associated features and expressive labels. The expressive talking head is synthesized by a unit selection algorithm, which selects and concatenates appropriate mouth image segments from the expressive database. Experimental results show that the smiles of talking heads are as realistic as the real ones objectively, and the viewers cannot distinguish the real smiles from the synthesized ones.",
keywords = "facial expression, image-based animation, Talking head, unit selection",
author = "Kang Liu and Joern Ostermann",
year = "2011",
month = sep,
doi = "10.1109/ICME.2011.6011835",
language = "English",
isbn = "9781612843490",
series = "Proceedings - IEEE International Conference on Multimedia and Expo",
booktitle = "Electronic Proceedings of the 2011 IEEE International Conference on Multimedia and Expo",
note = "2011 12th IEEE International Conference on Multimedia and Expo, ICME 2011 ; Conference date: 11-07-2011 Through 15-07-2011",

}


TY - GEN

T1 - Realistic facial expression synthesis for an image-based talking head

AU - Liu, Kang

AU - Ostermann, Joern

PY - 2011/9

Y1 - 2011/9

N2 - This paper presents an image-based talking head system that is able to synthesize realistic facial expressions accompanying speech, given arbitrary text input and control tags of facial expression. As an example of facial expression primitives, smile is used. First, three types of videos are recorded: a performer speaking without any expressions, smiling while speaking, and smiling after speaking. By analyzing the recorded audiovisual data, an expressive database is built and contains normalized neutral mouth images and smiling mouth images, as well as their associated features and expressive labels. The expressive talking head is synthesized by a unit selection algorithm, which selects and concatenates appropriate mouth image segments from the expressive database. Experimental results show that the smiles of talking heads are as realistic as the real ones objectively, and the viewers cannot distinguish the real smiles from the synthesized ones.

AB - This paper presents an image-based talking head system that is able to synthesize realistic facial expressions accompanying speech, given arbitrary text input and control tags of facial expression. As an example of facial expression primitives, smile is used. First, three types of videos are recorded: a performer speaking without any expressions, smiling while speaking, and smiling after speaking. By analyzing the recorded audiovisual data, an expressive database is built and contains normalized neutral mouth images and smiling mouth images, as well as their associated features and expressive labels. The expressive talking head is synthesized by a unit selection algorithm, which selects and concatenates appropriate mouth image segments from the expressive database. Experimental results show that the smiles of talking heads are as realistic as the real ones objectively, and the viewers cannot distinguish the real smiles from the synthesized ones.

KW - facial expression

KW - image-based animation

KW - Talking head

KW - unit selection

UR - http://www.scopus.com/inward/record.url?scp=80155129710&partnerID=8YFLogxK

U2 - 10.1109/ICME.2011.6011835

DO - 10.1109/ICME.2011.6011835

M3 - Conference contribution

AN - SCOPUS:80155129710

SN - 9781612843490

T3 - Proceedings - IEEE International Conference on Multimedia and Expo

BT - Electronic Proceedings of the 2011 IEEE International Conference on Multimedia and Expo

T2 - 2011 12th IEEE International Conference on Multimedia and Expo, ICME 2011

Y2 - 11 July 2011 through 15 July 2011

ER -
