Title: GCC-PHAT Cross-Correlation Audio Features for Simultaneous Sound Event Localization and Detection (SELD) on Multiple Rooms

Authors: Cordourier, Hector
Lopez Meyer, Paulo
Huang, Jonathan
Del Hoyo Ontiveros, Juan
Lu, Hong
Date Issued: Oct-2019
Citation: H. Cordourier, P. Meyer, J. Huang, J. Ontiveros & H. Lu, "GCC-PHAT Cross-Correlation Audio Features for Simultaneous Sound Event Localization and Detection (SELD) on Multiple Rooms", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 55–58, New York University, NY, USA, Oct. 2019
Abstract: In this work, we present a simultaneous sound event localization and detection (SELD) system with enhanced acoustic features, in which we propose using the well-known Generalized Cross Correlation with Phase Transform (GCC-PHAT) algorithm to augment the regular magnitude and phase Fourier spectra features at each frame. GCC-PHAT has long been used with classic signal processing techniques to calculate the Time Difference of Arrival (TDOA) between simultaneous audio signals in moderately reverberant environments, and it can assist audio source localization in current deep learning systems. The neural network architecture we used is a Convolutional Recurrent Neural Network (CRNN), tested on the sound database prepared for Task 3 of the 2019 DCASE Challenge. In the challenge results, our proposed system achieved a direction of arrival error of 20.8°, a frame recall of 85.6%, an F-score of 86.5%, and a detection error rate of 0.22 on the evaluation samples.
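Illustrative note: the abstract refers to GCC-PHAT as a classic way to estimate the TDOA between two microphone signals. The following is a minimal NumPy sketch of that general technique, not the authors' implementation. The function name `gcc_phat`, its parameters (`fs`, `max_tau`), and the synthetic test signal are assumptions made for illustration only.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref`, in seconds, with GCC-PHAT.

    Hypothetical helper for illustration; returns (delay_seconds, correlation).
    """
    n = len(sig) + len(ref)  # zero-pad so the circular correlation does not wrap
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)

    # Cross-power spectrum, then PHAT weighting: discard magnitude, keep phase.
    cross = sig_f * np.conj(ref_f)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero

    cc = np.fft.irfft(cross, n=n)

    # Limit the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Rotate so that lag 0 sits at index max_shift, then keep the lag window.
    cc = np.roll(cc, max_shift)[: 2 * max_shift + 1]
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs, cc


# Usage: a noise burst and a copy delayed by 5 samples (synthetic data).
rng = np.random.default_rng(0)
reference = rng.standard_normal(1024)
delayed = np.concatenate((np.zeros(5), reference[:-5]))
tau, _ = gcc_phat(delayed, reference, fs=16000)
print(f"estimated delay: {tau * 16000:.0f} samples")
```

In a SELD front end of the kind the abstract describes, the correlation vector (rather than only the peak lag) would typically be computed per frame and per microphone pair and stacked alongside the spectral features; the exact feature layout used in the paper is not stated in this record.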
First Page: 55
Last Page: 58
DOI: https://doi.org/10.33682/3re4-nd65
Type: Article
Appears in Collections: Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)

Files in This Item:
File: DCASE2019Workshop_CordourierMaruri_59.pdf
Size: 1.14 MB
Format: Adobe PDF


Items in FDA are protected by copyright, with all rights reserved, unless otherwise indicated.