TrellisNet-Based Architecture for Sound Event Localization and Detection with Reassembly Learning

Park, Sooyoung

doi:https://doi.org/10.33682/nzsm-jr50

Date Issued:	Oct-2019
Citation:	S. Park, "TrellisNet-Based Architecture for Sound Event Localization and Detection with Reassembly Learning", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 179–183, New York University, NY, USA, Oct. 2019
Abstract:	This paper proposes a deep learning technique and network model for DCASE 2019 task 3: Sound Event Localization and Detection. The convolutional recurrent neural network is currently regarded as the state-of-the-art technique for sound classification and detection. We propose a TrellisNet-based architecture that can replace the convolutional recurrent neural network and that performs better in direction-of-arrival estimation. We also propose reassembly learning for designing a single network that handles dependent sub-tasks together. Reassembly learning divides a multi-task problem into individual sub-tasks, trains each sub-task, and then reassembles the trained parts into a single network and fine-tunes it. Experimental results show that the proposed method improves sound event localization and detection performance compared to the DCASE 2019 baseline system.
First Page:	179
Last Page:	183
Type:	Article
Appears in Collections:	Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)

Files in This Item:

File	Size	Format
DCASE2019Workshop_Park_76.pdf	648.5 kB	Adobe PDF
