Urban Sound Tagging using Convolutional Neural Networks

Adapa, Sainath

doi:https://doi.org/10.33682/8axe-9243

Title:	Urban Sound Tagging using Convolutional Neural Networks
Authors:	Adapa, Sainath
Date Issued:	Oct-2019
Citation:	S. Adapa, "Urban Sound Tagging using Convolutional Neural Networks", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 5–9, New York University, NY, USA, Oct. 2019
Abstract:	In this paper, we propose a framework for environmental sound classification in a low-data context (fewer than 100 labeled examples per class). We show that combining pre-trained image classification models with data augmentation techniques yields higher performance than alternative approaches. We applied this system to the Urban Sound Tagging task of the DCASE 2019 challenge, whose objective was to label different sources of noise from raw audio data. A modified form of MobileNetV2, a convolutional neural network (CNN) model, was trained to classify coarse and fine tags jointly. The proposed model uses log-scaled Mel-spectrograms as the representation of the audio data. Mixup, random erasing, scaling, and shifting are used as data augmentation techniques. A second model that uses scaled labels was built to account for human errors in the annotations. The proposed model achieved first rank on the leaderboard, with Micro-AUPRC values of 0.751 and 0.860 on fine and coarse tags, respectively.
First Page:	5
Last Page:	9
DOI:	https://doi.org/10.33682/8axe-9243
Type:	Article
Appears in Collections:	Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)
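
Illustrative note: the abstract names mixup and random erasing among the augmentation techniques applied to log-scaled Mel-spectrograms. The sketch below shows one common way these two augmentations can be written with NumPy. It is not the author's implementation: the function names, the `alpha` and `max_frac` values, and the choice to fill the erased patch with uniform noise are all assumptions made for illustration, and plain arrays stand in for real spectrograms.

```python
import numpy as np


def mixup(x1, y1, x2, y2, alpha=0.4, rng=None):
    """Blend two examples and their multi-label targets.

    alpha is a hypothetical default; the record does not state the value used.
    """
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing weight in [0, 1]
    x = lam * x1 + (1.0 - lam) * x2       # blended spectrogram
    y = lam * y1 + (1.0 - lam) * y2       # blended (soft) label vector
    return x, y, lam


def random_erase(spec, max_frac=0.3, rng=None):
    """Overwrite one random rectangle of a (n_mels, n_frames) array with noise.

    max_frac (assumed) bounds the rectangle size along each axis.
    The input array is left unmodified; a copy is returned.
    """
    if rng is None:
        rng = np.random.default_rng()
    n_mels, n_frames = spec.shape
    h = rng.integers(1, max(2, int(n_mels * max_frac) + 1))
    w = rng.integers(1, max(2, int(n_frames * max_frac) + 1))
    top = rng.integers(0, n_mels - h + 1)
    left = rng.integers(0, n_frames - w + 1)
    out = spec.copy()
    out[top:top + h, left:left + w] = rng.uniform(
        spec.min(), spec.max(), size=(h, w)
    )
    return out
```

In a training loop, a function like `mixup` would typically be applied to pairs of examples drawn from the same batch, and `random_erase` to each spectrogram independently. Because mixup produces soft targets, it pairs naturally with a per-tag sigmoid output and binary cross-entropy loss in a multi-label setting such as this one.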

Files in This Item:

File	Size	Format
DCASE2019Workshop_Adapa_83.pdf	733.48 kB	Adobe PDF
