Title: | Sound Event Localization and Detection using CRNN Architecture with Mixup for Model Generalization |
Authors: | Pratik, Pranay; Jee, Wen Jie; Nagisetty, Srikanth; Mars, Rohith; Lim, Chongsoon |
Date Issued: | Oct-2019 |
Citation: | P. Pratik, W. Jee, S. Nagisetty, R. Mars & C. Lim, "Sound Event Localization and Detection using CRNN Architecture with Mixup for Model Generalization", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 199–203, New York University, NY, USA, Oct. 2019 |
Abstract: | In this paper, we present the details of our solution for the IEEE DCASE 2019 Task 3: Sound Event Localization and Detection (SELD) challenge. Given multi-channel audio as input, the goal is to predict all instances of the sound labels and their directions-of-arrival (DOAs) in the form of azimuth and elevation angles. Our solution is based on a Convolutional-Recurrent Neural Network (CRNN) architecture. In the CNN module of the proposed architecture, we introduce rectangular kernels in the pooling layers to minimize information loss along the temporal dimension, which in turn improves the performance of the RNN module. Mixup data augmentation is applied to help the network generalize better. The performance of the proposed architecture was evaluated with separate metrics for the sound event detection (SED) and localization tasks. Our team’s solution was ranked 5th in the DCASE 2019 Task 3 challenge, with an F-score of 93.7% and an error rate of 0.12 for the SED task, and a DOA error of 4.2° and a frame recall of 91.8% for the localization task, both on the evaluation set. These results show a significant performance improvement over the baseline system for both SED and localization. |
First Page: | 199 |
Last Page: | 203 |
DOI: | https://doi.org/10.33682/gbfk-re38 |
Type: | Article |
Appears in Collections: | Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019) |
Files in This Item:
File | Size | Format | |
---|---|---|---|
DCASE2019Workshop_Pratik_72.pdf | 685.51 kB | Adobe PDF | View/Open |
Items in FDA are protected by copyright, with all rights reserved, unless otherwise indicated.
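
Note on mixup: the abstract names mixup as the augmentation used but does not give details. As a rough illustration only, the general mixup idea blends two training examples and their targets with a weight drawn from a Beta distribution. The sketch below uses the Python standard library; the function name `mixup_pair`, the default `alpha=0.2`, and the use of flat feature lists are assumptions made for illustration and are not taken from the paper.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Blend two (features, targets) pairs with a Beta-distributed weight.

    Hypothetical helper: alpha and the flat-list data layout are
    illustrative assumptions, not the authors' settings.
    """
    rng = rng or random.Random()
    # Mixing weight lam ~ Beta(alpha, alpha), always within [0, 1].
    lam = rng.betavariate(alpha, alpha)
    # Convex combination of the inputs and of the targets, element by element.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Two toy examples with one-hot targets.
x, y, lam = mixup_pair([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0])
print(lam, x, y)
```

With a small `alpha`, most draws of `lam` fall near 0 or 1, so the blended example stays close to one of the two originals. How the authors apply mixup to the SED and DOA targets is described in the full paper, not in this record.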