Title: | Improvement of DOA Estimation by using Quaternion Output in Sound Event Localization and Detection |
Authors: | Sudo, Yui; Itoyama, Katsutoshi; Nishida, Kenji; Nakadai, Kazuhiro |
Date Issued: | Oct-2019 |
Citation: | Y. Sudo, K. Itoyama, K. Nishida & K. Nakadai, "Improvement of DOA Estimation by using Quaternion Output in Sound Event Localization and Detection", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 244–247, New York University, NY, USA, Oct. 2019 |
Abstract: | This paper describes an improvement in Direction of Arrival (DOA) estimation performance obtained by using quaternion output in the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 Task 3. DCASE 2019 Task 3 focuses on sound event localization and detection (SELD), a task that estimates the sound source direction simultaneously with conventional sound event detection (SED). In the baseline method, the sound source direction angle is regressed directly. However, the angle is periodic and has discontinuities, which may make learning unstable. Specifically, even though -180 deg and 180 deg point in the same direction, a large loss is calculated between them. Estimating DOA angles with a classification approach instead of regression avoids the instability caused by these discontinuities, but it limits the angular resolution. In this paper, we propose introducing the quaternion, which is a continuous function, into the output layer of the neural network instead of directly estimating the sound source direction angle. This method can be implemented simply by changing the output of an existing neural network, and thus does not significantly increase the number of parameters in the middle layers. Experimental results show that the proposed method improves DOA estimation without significantly increasing the number of parameters. |
First Page: | 244 |
Last Page: | 247 |
DOI: | https://doi.org/10.33682/jj50-hm12 |
Type: | Article |
Appears in Collections: | Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019) |
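The discontinuity described in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: it assumes an azimuth-only direction encoded as a unit quaternion for a rotation about the z-axis, and it assumes a sign-invariant distance (since q and -q describe the same rotation). The function names `angle_sq_error`, `azimuth_to_quaternion`, `quaternion_distance`, and `quaternion_to_azimuth` are hypothetical and chosen only for illustration.

```python
import numpy as np


def angle_sq_error(a_deg, b_deg):
    """Squared error on raw angles, as in direct angle regression."""
    return float((a_deg - b_deg) ** 2)


def azimuth_to_quaternion(az_deg):
    """Unit quaternion (w, x, y, z) for a rotation of az_deg about the z-axis.

    Assumption: azimuth only; elevation is omitted to keep the sketch short.
    """
    half = np.deg2rad(az_deg) / 2.0
    return np.array([np.cos(half), 0.0, 0.0, np.sin(half)])


def quaternion_distance(q1, q2):
    """Sign-invariant distance: 0 when q1 and q2 encode the same rotation.

    Assumption: this is one common choice, not necessarily the paper's loss.
    """
    return float(1.0 - abs(np.dot(q1, q2)))


def quaternion_to_azimuth(q):
    """Recover the azimuth in degrees from a (possibly unnormalized) quaternion."""
    w, x, y, z = q / np.linalg.norm(q)
    yaw = np.arctan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return float(np.rad2deg(yaw))


if __name__ == "__main__":
    # Two directions only 2 degrees apart, on either side of the +/-180 seam.
    a, b = -179.0, 179.0
    print("raw angle squared error:", angle_sq_error(a, b))
    print("quaternion distance:", quaternion_distance(
        azimuth_to_quaternion(a), azimuth_to_quaternion(b)))
```

Under these assumptions, the raw-angle squared error for -179 deg versus 179 deg is 358 squared (128164), while the quaternion distance is roughly 1.5e-4, reflecting that the two directions are nearly identical.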
Files in This Item:
File | Size | Format | |
---|---|---|---|
DCASE2019Workshop_Sudo_81.pdf | 396.27 kB | Adobe PDF | View/Open |
Items in FDA are protected by copyright, with all rights reserved, unless otherwise indicated.