
Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Nils Poschadel
  • Stephan Preihs
  • Jürgen Peissig

Details

Original language: English
Title of host publication: 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings
Pages: 1015-1019
Number of pages: 5
ISBN (electronic): 9789082797060
Publication status: Published - 2021
Event: 29th European Signal Processing Conference, EUSIPCO 2021 - Dublin, Ireland
Duration: 23 Aug 2021 to 27 Aug 2021

Publication series

Name: European Signal Processing Conference
Volume: 2021-August
ISSN (Print): 2219-5491
ISSN (electronic): 2076-1465

Abstract

Convolutional recurrent neural networks provide state-of-the-art results in direction of arrival estimation based on first-order Ambisonics signals, especially in the presence of noise and/or interfering sound sources. In this work, we investigate whether increasing the Ambisonics order up to the fourth order further improves the estimation results in a challenging multi-speaker setting with two or three simultaneously active speakers. Our results show that each additional Ambisonics order further improves localization performance for speech signals based on both simulated and measured spatial room impulse responses. The greatest gains in accuracy are observed in the particularly demanding scenarios with three speakers and a poor signal-to-interference ratio.
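
For context on what raising the Ambisonics order means for the input representation: a full-sphere (3D) Ambisonics signal of order N comprises (N+1)^2 spherical-harmonic channels. The sketch below is not taken from the paper; the function name `ambisonics_channel_count` is hypothetical, and how the network consumes these channels is not stated in the abstract. It only tabulates the channel count for orders 1 to 4.

```python
def ambisonics_channel_count(order: int) -> int:
    """Number of spherical-harmonic channels of a full-sphere (3D)
    Ambisonics signal of the given order: (order + 1) squared."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


# Orders 1 to 4 are the range named in the abstract.
channel_counts = {n: ambisonics_channel_count(n) for n in range(1, 5)}

for n, count in channel_counts.items():
    print(f"order {n}: {count} channels")
```

Going from first to fourth order thus raises the channel count from 4 to 25, which is the additional spatial information the study evaluates.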

Keywords

    Convolutional recurrent neural network, Higher-order ambisonics, Multi-source direction of arrival estimation, Spherical harmonics

Cite this

Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals. / Poschadel, Nils; Preihs, Stephan; Peissig, Jürgen.
29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings. 2021. p. 1015-1019 (European Signal Processing Conference; Vol. 2021-August).

Poschadel, N, Preihs, S & Peissig, J 2021, Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals. in 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings. European Signal Processing Conference, vol. 2021-August, pp. 1015-1019, 29th European Signal Processing Conference, EUSIPCO 2021, Dublin, Ireland, 23 Aug 2021. https://doi.org/10.23919/EUSIPCO54536.2021.9616002
Poschadel, N., Preihs, S., & Peissig, J. (2021). Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals. In 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings (pp. 1015-1019). (European Signal Processing Conference; Vol. 2021-August). https://doi.org/10.23919/EUSIPCO54536.2021.9616002
Poschadel N, Preihs S, Peissig J. Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals. In 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings. 2021. p. 1015-1019. (European Signal Processing Conference). doi: 10.23919/EUSIPCO54536.2021.9616002
Poschadel, Nils ; Preihs, Stephan ; Peissig, Jürgen. / Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals. 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings. 2021. pp. 1015-1019 (European Signal Processing Conference).
@inproceedings{1c0eaf0bf3cd4b4bb638074e725f7f83,
title = "Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals",
abstract = "Convolutional recurrent neural networks provide state-of-the-art results in direction of arrival estimation based on first-order Ambisonics signals, especially in the presence of noise and/or interfering sound sources. In this work, we investigate whether increasing the Ambisonics order up to the fourth order further improves the estimation results in a challenging multi-speaker setting with two or three simultaneously active speakers. Our results show that each additional Ambisonics order further improves localization performance for speech signals based on both simulated and measured spatial room impulse responses. The greatest gains in accuracy are observed in the particularly demanding scenarios with three speakers and a poor signal-to-interference ratio.",
keywords = "Convolutional recurrent neural network, Higher-order ambisonics, Multi-source direction of arrival estimation, Spherical harmonics",
author = "Nils Poschadel and Stephan Preihs and J{\"u}rgen Peissig",
year = "2021",
doi = "10.23919/EUSIPCO54536.2021.9616002",
language = "English",
isbn = "978-1-6654-0900-1",
series = "European Signal Processing Conference",
pages = "1015--1019",
booktitle = "29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings",
note = "29th European Signal Processing Conference, EUSIPCO 2021 ; Conference date: 23-08-2021 Through 27-08-2021",

}

TY - GEN

T1 - Multi-Source Direction of Arrival Estimation of Noisy Speech using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals

AU - Poschadel, Nils

AU - Preihs, Stephan

AU - Peissig, Jürgen

PY - 2021

Y1 - 2021

N2 - Convolutional recurrent neural networks provide state-of-the-art results in direction of arrival estimation based on first-order Ambisonics signals, especially in the presence of noise and/or interfering sound sources. In this work, we investigate whether increasing the Ambisonics order up to the fourth order further improves the estimation results in a challenging multi-speaker setting with two or three simultaneously active speakers. Our results show that each additional Ambisonics order further improves localization performance for speech signals based on both simulated and measured spatial room impulse responses. The greatest gains in accuracy are observed in the particularly demanding scenarios with three speakers and a poor signal-to-interference ratio.

AB - Convolutional recurrent neural networks provide state-of-the-art results in direction of arrival estimation based on first-order Ambisonics signals, especially in the presence of noise and/or interfering sound sources. In this work, we investigate whether increasing the Ambisonics order up to the fourth order further improves the estimation results in a challenging multi-speaker setting with two or three simultaneously active speakers. Our results show that each additional Ambisonics order further improves localization performance for speech signals based on both simulated and measured spatial room impulse responses. The greatest gains in accuracy are observed in the particularly demanding scenarios with three speakers and a poor signal-to-interference ratio.

KW - Convolutional recurrent neural network

KW - Higher-order ambisonics

KW - Multi-source direction of arrival estimation

KW - Spherical harmonics

UR - http://www.scopus.com/inward/record.url?scp=85123163406&partnerID=8YFLogxK

U2 - 10.23919/EUSIPCO54536.2021.9616002

DO - 10.23919/EUSIPCO54536.2021.9616002

M3 - Conference contribution

AN - SCOPUS:85123163406

SN - 978-1-6654-0900-1

T3 - European Signal Processing Conference

SP - 1015

EP - 1019

BT - 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings

T2 - 29th European Signal Processing Conference, EUSIPCO 2021

Y2 - 23 August 2021 through 27 August 2021

ER -