Acoustic Scene Classification Using Deep Learning-based Ensemble Averaging

Title:	Acoustic Scene Classification Using Deep Learning-based Ensemble Averaging
Authors:	Huang, Jonathan; Lu, Hong; Lopez Meyer, Paulo; Cordourier, Hector; Del Hoyo Ontiveros, Juan
Date Issued:	Oct-2019
Citation:	J. Huang, H. Lu, P. Meyer, H. Cordourier & J. Ontiveros, "Acoustic Scene Classification Using Deep Learning-based Ensemble Averaging", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 94–98, New York University, NY, USA, Oct. 2019
Abstract:	In our submission to DCASE 2019 Task 1a, we explored four deep learning-based neural network architectures: Vgg12, ResNet50, AclNet, and AclSincNet. To improve performance, these four architectures were pre-trained with Audioset data and then fine-tuned on the task's development set. Because the networks differ in both feature front-end and architecture, their outputs proved complementary when fused. Trained on the challenge's default development split, the ensemble of these models' outputs improved validation-set accuracy from 77.9% for the best single model to 83.0%. On the challenge's evaluation set, our best ensemble achieved a classification accuracy of 81.3%.
First Page:	94
Last Page:	98
DOI:	https://doi.org/10.33682/8rd2-g787
Type:	Article
Appears in Collections:	Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)

Files in This Item:

File	Size	Format
DCASE2019Workshop_Huang_52.pdf	424.6 kB	Adobe PDF
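
The abstract describes fusing the outputs of four networks by ensemble averaging. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes each model emits a per-class probability vector for a clip and that all models are weighted equally. The function names (`ensemble_average`, `predict`) and the probability values are hypothetical and chosen only for illustration; the record does not state the paper's exact fusion rule or weights.

```python
from typing import List, Sequence


def ensemble_average(model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities across models with equal weights."""
    if not model_probs:
        raise ValueError("need at least one model output")
    n_classes = len(model_probs[0])
    if any(len(p) != n_classes for p in model_probs):
        raise ValueError("all models must score the same set of classes")
    n_models = len(model_probs)
    return [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]


def predict(model_probs: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged probability."""
    avg = ensemble_average(model_probs)
    return max(range(len(avg)), key=avg.__getitem__)


# Hypothetical outputs of four models for one clip over three scene classes.
# The numbers are invented for illustration and are not from the paper.
outputs = [
    [0.50, 0.40, 0.10],
    [0.20, 0.60, 0.20],
    [0.45, 0.35, 0.20],
    [0.10, 0.70, 0.20],
]

fused = ensemble_average(outputs)
label = predict(outputs)
```

In this made-up example two of the four models individually favor class 0 and two favor class 1; averaging lets the more confident votes for class 1 decide the fused label, which is one way complementary models can outperform any single one.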