Title: Improvement of DOA Estimation by using Quaternion Output in Sound Event Localization and Detection

Authors: Sudo, Yui
Itoyama, Katsutoshi
Nishida, Kenji
Nakadai, Kazuhiro
Date Issued: Oct-2019
Citation: Y. Sudo, K. Itoyama, K. Nishida & K. Nakadai, "Improvement of DOA Estimation by using Quaternion Output in Sound Event Localization and Detection", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 244–247, New York University, NY, USA, Oct. 2019
Abstract: This paper describes an improvement in Direction of Arrival (DOA) estimation performance using quaternion output in the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 Task 3. DCASE 2019 Task 3 focuses on sound event localization and detection (SELD), a task that estimates the sound source direction simultaneously with conventional sound event detection (SED). In the baseline method, the sound source direction angle is regressed directly. However, the angle is periodic and has discontinuities that may make learning unstable. Specifically, even though -180 deg and 180 deg are the same direction, a large loss is calculated between them. Estimating DOA angles with a classification approach instead of regression avoids this instability, but it limits the resolution. In this paper, we propose introducing the quaternion, which is a continuous function, into the output layer of the neural network instead of directly estimating the sound source direction angle. This method can be implemented simply by changing the output of an existing neural network, and thus does not significantly increase the number of parameters in the middle layers. Experimental results show that the proposed method improves DOA estimation without significantly increasing the number of parameters.
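The discontinuity described in the abstract can be illustrated with a small sketch. The abstract does not specify how the paper encodes a direction as a quaternion or which loss it uses, so everything below is an assumption made for illustration: the rotation order (azimuth about z, then elevation about y), the function names, and the sign-invariant distance 1 - |q1 . q2| are hypothetical choices, not the authors' implementation.

```python
import math

def angle_sq_error(a_deg, b_deg):
    """Squared error when the angle itself is regressed (degrees squared)."""
    return (a_deg - b_deg) ** 2

def doa_to_quaternion(azimuth_deg, elevation_deg):
    """Unit quaternion (w, x, y, z) for an assumed convention:
    rotate by azimuth about z, composed with elevation about y."""
    a = math.radians(azimuth_deg) / 2.0
    e = math.radians(elevation_deg) / 2.0
    ca, sa = math.cos(a), math.sin(a)
    ce, se = math.cos(e), math.sin(e)
    # Hamilton product of (ca, 0, 0, sa) and (ce, 0, se, 0).
    return (ca * ce, -sa * se, ca * se, sa * ce)

def quaternion_distance(q1, q2):
    """1 - |dot product|; q and -q encode the same rotation, so the
    absolute value makes the distance ignore the overall sign."""
    dot = sum(x * y for x, y in zip(q1, q2))
    return 1.0 - abs(dot)

# Azimuths of -180 deg and 180 deg point in the same direction.
wrap_angle_loss = angle_sq_error(180.0, -180.0)
wrap_quat_loss = quaternion_distance(
    doa_to_quaternion(180.0, 20.0), doa_to_quaternion(-180.0, 20.0)
)
# Two genuinely different directions still get a nonzero distance.
distinct_quat_loss = quaternion_distance(
    doa_to_quaternion(0.0, 0.0), doa_to_quaternion(90.0, 0.0)
)
```

Under these assumptions, the direct angle error at the wraparound is large while the quaternion distance is zero up to floating-point rounding. A plain squared error on quaternion components would not have this property, because q and -q differ in sign while encoding the same rotation.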
First Page: 244
Last Page: 247
DOI: https://doi.org/10.33682/jj50-hm12
Type: Article
Appears in Collections: Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)

Files in This Item:
File: DCASE2019Workshop_Sudo_81.pdf
Size: 396.27 kB
Format: Adobe PDF


Items in FDA are protected by copyright, with all rights reserved, unless otherwise indicated.