Details
Original language | English |
---|---|
Article number | 16 |
Number of pages | 24 |
Journal | Eurasip Journal on Audio, Speech, and Music Processing |
Volume | 2024 |
Publication status | Published - 27 March 2024 |
Abstract
Audio augmented reality (AAR), a prominent topic in the field of audio, requires understanding the listening environment of the user for rendering an authentic virtual auditory object. Reverberation time (RT60) is a predominant metric for the characterization of room acoustics and numerous approaches have been proposed to estimate it blindly from a reverberant speech signal. However, a single RT60 value may not be sufficient to correctly describe and render the acoustics of a room. This contribution presents a method for the estimation of multiple room acoustic parameters required to render close-to-accurate room acoustics in an unknown environment. It is shown how these parameters can be estimated blindly using an audio transformer that can be deployed on a mobile device. Furthermore, the paper also discusses the use of the estimated room acoustic parameters to find a similar room from a dataset of real BRIRs that can be further used for rendering the virtual audio source. Additionally, a novel binaural room impulse response (BRIR) augmentation technique to overcome the limitation of inadequate data is proposed. Finally, the proposed method is validated perceptually by means of a listening test.
ASJC Scopus subject areas
- Physics and Astronomy (all)
- Acoustics and Ultrasonics
- Engineering (all)
- Electrical and Electronic Engineering
In: Eurasip Journal on Audio, Speech, and Music Processing, Vol. 2024, 16, 27.03.2024.
Publication: Contribution to journal › Article › Research › Peer-reviewed
TY - JOUR
T1 - An end-to-end approach for blindly rendering a virtual sound source in an audio augmented reality environment
AU - Saini, Shivam
AU - Engel, Isaac
AU - Peissig, Jürgen
N1 - Funding Information: This work is part of the PhD research of Shivam Saini at Leibniz Universität Hannover commissioned by Huawei Technologies Düsseldorf GmbH.
PY - 2024/3/27
Y1 - 2024/3/27
KW - Binaural rendering
KW - BRIR augmentation
KW - Reverberation time estimation
UR - http://www.scopus.com/inward/record.url?scp=85189182996&partnerID=8YFLogxK
U2 - 10.1186/s13636-024-00338-6
DO - 10.1186/s13636-024-00338-6
M3 - Article
AN - SCOPUS:85189182996
VL - 2024
JO - Eurasip Journal on Audio, Speech, and Music Processing
JF - Eurasip Journal on Audio, Speech, and Music Processing
SN - 1687-4714
M1 - 16
ER -