Title: | Audio Tagging using Linear Noise Modelling Layer |
Authors: | Singh, Shubhr; Pankajakshan, Arjun; Benetos, Emmanouil |
Date Issued: | Oct-2019 |
Citation: | S. Singh, A. Pankajakshan & E. Benetos, "Audio Tagging using Linear Noise Modelling Layer", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 234–238, New York University, NY, USA, Oct. 2019 |
Abstract: | Label noise refers to the presence of inaccurate target labels in a dataset. It impedes the performance of a deep neural network (DNN) because the network tends to overfit to the noisy labels, so a generic methodology for countering its effects is needed. FSDnoisy18k is an audio dataset collected to encourage research on label noise in sound event classification. The dataset contains ∼42.5 hours of audio recordings across 20 classes, with a small amount of manually verified labels and a large amount of noisy data. Using this dataset, our work explores the potential of modelling the label noise distribution by adding a linear layer on top of a baseline network. The accuracy of this approach is compared with the alternative of adopting a noise-robust loss function. Results show that modelling the noise distribution improves the accuracy of the baseline network to a similar extent as the soft bootstrapping loss. |
First Page: | 234 |
Last Page: | 238 |
DOI: | https://doi.org/10.33682/zyc0-jw35 |
Type: | Article |
Appears in Collections: | Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019) |
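Illustrative sketch (not taken from the paper): the abstract contrasts two ways of handling label noise, a linear layer that models the noise distribution on top of a baseline network and a soft bootstrapping loss. The Python below shows one common reading of each idea, under stated assumptions. The noise layer is assumed to be a row-normalised transition matrix applied to the baseline's class probabilities, and the soft bootstrapping loss is assumed to mix the given label with the model's own prediction using a weight `beta`. The function names, the toy 3-class setup, and `beta=0.95` are hypothetical choices for illustration; the paper's actual architecture and hyperparameters may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def noise_layer(clean_probs, transition_logits):
    """Map clean-class probabilities to noisy-label probabilities.

    Assumption: transition_logits[i][j] is a learnable score for observing
    noisy label j when the true class is i. Each row is softmax-normalised,
    so the layer is linear in clean_probs and its output sums to 1.
    """
    transition = [softmax(row) for row in transition_logits]
    n = len(clean_probs)
    return [
        sum(clean_probs[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]


def cross_entropy(target, probs, eps=1e-12):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))


def soft_bootstrap_loss(target, probs, beta=0.95, eps=1e-12):
    """Soft bootstrapping: train against a blend of the (possibly noisy)
    label and the model's current prediction. beta=1 recovers plain
    cross-entropy. The default beta here is an assumption.
    """
    mixed = [beta * t + (1.0 - beta) * p for t, p in zip(target, probs)]
    return cross_entropy(mixed, probs, eps)


# Toy example with 3 classes (hypothetical values).
baseline_logits = [2.0, 0.5, -1.0]
clean_probs = softmax(baseline_logits)

# Initialise the transition scores close to the identity, so the layer
# starts out as "labels are mostly correct".
transition_logits = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
noisy_probs = noise_layer(clean_probs, transition_logits)

noisy_label = [1.0, 0.0, 0.0]  # one-hot label that may be wrong
loss_with_noise_layer = cross_entropy(noisy_label, noisy_probs)
loss_with_bootstrapping = soft_bootstrap_loss(noisy_label, clean_probs)
```

In this reading, the noise layer is trained against the noisy labels and then bypassed at test time so that predictions come from `clean_probs`, whereas soft bootstrapping leaves the network unchanged and only alters the training target.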
Files in This Item:
File | Size | Format |
---|---|---|
DCASE2019Workshop_Singh_86.pdf | 462.94 kB | Adobe PDF |
Items in FDA are protected by copyright, with all rights reserved, unless otherwise indicated.