Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals

Publication: Contribution to book/report/anthology/conference proceedings; Conference paper; Research; Peer-reviewed

Authors

  • Nils Poschadel
  • Stephan Preihs
  • Jürgen Peissig

Details

Original language: English
Title of host publication: 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings
Pages: 1015-1019
Number of pages: 5
ISBN (electronic): 9789082797060
Publication status: Published - 2021
Event: 29th European Signal Processing Conference, EUSIPCO 2021 - Dublin, Ireland
Duration: 23 Aug 2021 to 27 Aug 2021

Publication series

Name: European Signal Processing Conference
Volume: 2021-August
ISSN (Print): 2219-5491
ISSN (electronic): 2076-1465

Abstract

Convolutional recurrent neural networks provide state-of-the-art results in direction of arrival estimation based on first-order Ambisonics signals, especially in the presence of noise and/or interfering sound sources. In this work, we investigate whether increasing the Ambisonics order up to the fourth order further improves the estimation results in a challenging multi-speaker setting with two or three simultaneously active speakers. Our results show that each additional order of the Ambisonics representation further improves the localization performance for speech signals based on both simulated and measured spatial room impulse responses. The greatest gains in accuracy are observed in the particularly demanding scenarios with three speakers and a poor signal-to-interference ratio.
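
For orientation, a full-sphere Ambisonics signal of order N comprises (N + 1)^2 spherical-harmonic channels, so going from first to fourth order raises the channel count from 4 to 25. The following is a minimal sketch of that relationship only; the function name is illustrative and not taken from the paper, and the abstract does not state which input features the network actually uses.

```python
def ambisonics_channel_count(order: int) -> int:
    """Return the number of spherical-harmonic channels of a
    full-sphere (3D) Ambisonics signal of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


# Orders 1 to 4 are the ones compared in the abstract.
channel_counts = {order: ambisonics_channel_count(order) for order in range(1, 5)}

for order, channels in channel_counts.items():
    print(f"order {order}: {channels} channels")
```

Each additional order adds 2N + 1 channels, which is the extra spatial information the higher-order representation makes available for localization.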

Cite this

Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals. / Poschadel, Nils; Preihs, Stephan; Peissig, Jürgen.
29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings. 2021. pp. 1015-1019 (European Signal Processing Conference; Vol. 2021-August).

Poschadel, N, Preihs, S & Peissig, J 2021, Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals. in 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings. European Signal Processing Conference, vol. 2021-August, pp. 1015-1019, 29th European Signal Processing Conference, EUSIPCO 2021, Dublin, Ireland, 23 Aug. 2021. https://doi.org/10.23919/EUSIPCO54536.2021.9616002
Poschadel, N., Preihs, S., & Peissig, J. (2021). Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals. In 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings (pp. 1015-1019). (European Signal Processing Conference; Vol. 2021-August). https://doi.org/10.23919/EUSIPCO54536.2021.9616002
Poschadel N, Preihs S, Peissig J. Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals. in 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings. 2021. pp. 1015-1019. (European Signal Processing Conference). doi: 10.23919/EUSIPCO54536.2021.9616002
Poschadel, Nils ; Preihs, Stephan ; Peissig, Jürgen. / Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals. 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings. 2021. pp. 1015-1019 (European Signal Processing Conference).
@inproceedings{1c0eaf0bf3cd4b4bb638074e725f7f83,
title = "Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals",
abstract = "Convolutional recurrent neural networks provide state of the art results in direction of arrival estimation based on first-order Ambisonics signals, especially in the presence of noise and/or interfering sound sources. In this work, we investigate whether increasing the order of Ambisonics up to the fourth order further improves the estimation results in a challenging multi-speaker setting with two or three simultaneously active speakers. Our results show that each additional order of the Ambisonics representation further improves the localization performance for both speech signals based on simulated and real measured spatial room impulse responses. The greatest gains in accuracy can be observed in the particularly demanding scenarios with three speakers and poor signal-to-interference-ratio.",
keywords = "Convolutional recurrent neural network, Higher-order ambisonics, Multi-source direction of arrival estimation, Spherical harmonics",
author = "Nils Poschadel and Stephan Preihs and J{\"u}rgen Peissig",
year = "2021",
doi = "10.23919/EUSIPCO54536.2021.9616002",
language = "English",
isbn = "978-1-6654-0900-1",
series = "European Signal Processing Conference",
pages = "1015--1019",
booktitle = "29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings",
note = "29th European Signal Processing Conference, EUSIPCO 2021 ; Conference date: 23-08-2021 Through 27-08-2021",

}

TY - GEN

T1 - Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals

AU - Poschadel, Nils

AU - Preihs, Stephan

AU - Peissig, Jürgen

PY - 2021

Y1 - 2021

N2 - Convolutional recurrent neural networks provide state of the art results in direction of arrival estimation based on first-order Ambisonics signals, especially in the presence of noise and/or interfering sound sources. In this work, we investigate whether increasing the order of Ambisonics up to the fourth order further improves the estimation results in a challenging multi-speaker setting with two or three simultaneously active speakers. Our results show that each additional order of the Ambisonics representation further improves the localization performance for both speech signals based on simulated and real measured spatial room impulse responses. The greatest gains in accuracy can be observed in the particularly demanding scenarios with three speakers and poor signal-to-interference-ratio.

AB - Convolutional recurrent neural networks provide state of the art results in direction of arrival estimation based on first-order Ambisonics signals, especially in the presence of noise and/or interfering sound sources. In this work, we investigate whether increasing the order of Ambisonics up to the fourth order further improves the estimation results in a challenging multi-speaker setting with two or three simultaneously active speakers. Our results show that each additional order of the Ambisonics representation further improves the localization performance for both speech signals based on simulated and real measured spatial room impulse responses. The greatest gains in accuracy can be observed in the particularly demanding scenarios with three speakers and poor signal-to-interference-ratio.

KW - Convolutional recurrent neural network

KW - Higher-order ambisonics

KW - Multi-source direction of arrival estimation

KW - Spherical harmonics

UR - http://www.scopus.com/inward/record.url?scp=85123163406&partnerID=8YFLogxK

U2 - 10.23919/EUSIPCO54536.2021.9616002

DO - 10.23919/EUSIPCO54536.2021.9616002

M3 - Conference contribution

AN - SCOPUS:85123163406

SN - 978-1-6654-0900-1

T3 - European Signal Processing Conference

SP - 1015

EP - 1019

BT - 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings

T2 - 29th European Signal Processing Conference, EUSIPCO 2021

Y2 - 23 August 2021 through 27 August 2021

ER -