From audio-only to audio and video text-to-speech

Publication: Contribution to journal, Article, Research, Peer-reviewed

Authors

External organizations

  • NEC Laboratories America, Inc.
  • AT&T Labs

Details

Original language: English
Pages (from - to): 1084-1095
Number of pages: 12
Journal: Acta Acustica united with Acustica
Volume: 90
Issue number: 6
Publication status: Published - Nov. 2004

Abstract

Progress made with the AT&T sample-based visual text-to-speech (VTTS) system is discussed. The VTTS system from AT&T incorporates unit selection synthesis and a moderate-size recorded database of modified and concatenated video segments. It is suggested that several steps, such as highly accurate image analysis tools for creating video clip databases, fast research techniques, and rendering of composite face images on a graphic screen, are very important to assure a high-quality sample-based VTTS system. It was found that accuracy and timeliness of lip closures and protrusions, turning points, and overall smoothness are critical for the system.

Cite this

From audio-only to audio and video text-to-speech. / Cosatto, Eric; Graf, Hans Peter; Ostermann, Jörn et al.
In: Acta Acustica united with Acustica, Vol. 90, No. 6, 11.2004, pp. 1084-1095.

Cosatto, E, Graf, HP, Ostermann, J & Schroeter, J 2004, 'From audio-only to audio and video text-to-speech', Acta Acustica united with Acustica, vol. 90, no. 6, pp. 1084-1095.
Cosatto, E., Graf, H. P., Ostermann, J., & Schroeter, J. (2004). From audio-only to audio and video text-to-speech. Acta Acustica united with Acustica, 90(6), 1084-1095.
Cosatto, Eric ; Graf, Hans Peter ; Ostermann, Jörn et al. / From audio-only to audio and video text-to-speech. In: Acta Acustica united with Acustica. 2004 ; Vol. 90, No. 6. pp. 1084-1095.
@article{d0508de0cb514692871afd5ed2ec9962,
title = "From audio-only to audio and video text-to-speech",
abstract = "Progress made with the AT&T sample-based visual text-to-speech (VTTS) system is discussed. The VTTS system from AT&T incorporates unit selection synthesis and a moderate-size recorded database of modified and concatenated video segments. It is suggested that several steps, such as highly accurate image analysis tools for creating video clip databases, fast research techniques, and rendering of composite face images on a graphic screen, are very important to assure a high-quality sample-based VTTS system. It was found that accuracy and timeliness of lip closures and protrusions, turning points, and overall smoothness are critical for the system.",
author = "Eric Cosatto and Graf, {Hans Peter} and J{\"o}rn Ostermann and Juergen Schroeter",
year = "2004",
month = nov,
language = "English",
volume = "90",
pages = "1084--1095",
journal = "Acta Acustica united with Acustica",
issn = "1610-1928",
publisher = "S. Hirzel Verlag GmbH",
number = "6",

}

TY - JOUR

T1 - From audio-only to audio and video text-to-speech

AU - Cosatto, Eric

AU - Graf, Hans Peter

AU - Ostermann, Jörn

AU - Schroeter, Juergen

PY - 2004/11

Y1 - 2004/11

N2 - Progress made with the AT&T sample-based visual text-to-speech (VTTS) system is discussed. The VTTS system from AT&T incorporates unit selection synthesis and a moderate-size recorded database of modified and concatenated video segments. It is suggested that several steps, such as highly accurate image analysis tools for creating video clip databases, fast research techniques, and rendering of composite face images on a graphic screen, are very important to assure a high-quality sample-based VTTS system. It was found that accuracy and timeliness of lip closures and protrusions, turning points, and overall smoothness are critical for the system.

AB - Progress made with the AT&T sample-based visual text-to-speech (VTTS) system is discussed. The VTTS system from AT&T incorporates unit selection synthesis and a moderate-size recorded database of modified and concatenated video segments. It is suggested that several steps, such as highly accurate image analysis tools for creating video clip databases, fast research techniques, and rendering of composite face images on a graphic screen, are very important to assure a high-quality sample-based VTTS system. It was found that accuracy and timeliness of lip closures and protrusions, turning points, and overall smoothness are critical for the system.

UR - http://www.scopus.com/inward/record.url?scp=11244348117&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:11244348117

VL - 90

SP - 1084

EP - 1095

JO - Acta Acustica united with Acustica

JF - Acta Acustica united with Acustica

SN - 1610-1928

IS - 6

ER -
