Title: | TrellisNet-Based Architecture for Sound Event Localization and Detection with Reassembly Learning |
Authors: | Park, Sooyoung |
Date Issued: | Oct-2019 |
Citation: | S. Park, "TrellisNet-Based Architecture for Sound Event Localization and Detection with Reassembly Learning", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 179–183, New York University, NY, USA, Oct. 2019 |
Abstract: | This paper proposes a deep learning technique and network model for DCASE 2019 task 3: Sound Event Localization and Detection. The convolutional recurrent neural network is currently regarded as the state-of-the-art technique for sound classification and detection. We propose a TrellisNet-based architecture that can replace the convolutional recurrent neural network. Our TrellisNet-based architecture performs better in direction-of-arrival estimation than the convolutional recurrent neural network. We also propose reassembly learning to design a single network that handles dependent sub-tasks together. Reassembly learning divides a multi-task problem into individual sub-tasks, trains each sub-task, then reassembles and fine-tunes them into a single network. Experimental results show that the proposed method improves sound event localization and detection performance compared to the DCASE 2019 baseline system. |
First Page: | 179 |
Last Page: | 183 |
DOI: | https://doi.org/10.33682/nzsm-jr50 |
Type: | Article |
Appears in Collections: | Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019) |
Files in This Item:
File | Size | Format |
---|---|---|
DCASE2019Workshop_Park_76.pdf | 648.5 kB | Adobe PDF |