Title: Acoustic Scene Classification Using Deep Learning-based Ensemble Averaging

Authors: Huang, Jonathan
Lu, Hong
Lopez Meyer, Paulo
Cordourier, Hector
Del Hoyo Ontiveros, Juan
Date Issued: Oct-2019
Citation: J. Huang, H. Lu, P. Meyer, H. Cordourier & J. Ontiveros, "Acoustic Scene Classification Using Deep Learning-based Ensemble Averaging", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 94–98, New York University, NY, USA, Oct. 2019
Abstract: In our submission to DCASE 2019 Task 1a, we explored four deep-learning-based neural network architectures: Vgg12, ResNet50, AclNet, and AclSincNet. To improve performance, these four architectures were pre-trained on Audioset data and then fine-tuned on the task's development set. Because the networks differ in both feature front-end and architecture, their outputs proved complementary when fused. Ensembling these models' outputs improved accuracy on the validation set from 77.9% for the best single model to 83.0%, with models trained on the challenge's default development split. On the challenge's evaluation set, our best ensemble achieved a classification accuracy of 81.3%.
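The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, so this assumes an unweighted mean of per-class probabilities across models. The function names (`ensemble_average`, `predict`, `accuracy`) and the toy scores for three hypothetical models are illustrative assumptions, not the authors' code or results.

```python
from typing import List, Sequence

Probs = Sequence[Sequence[float]]  # probs[i][c]: score of class c for clip i


def ensemble_average(model_probs: Sequence[Probs]) -> List[List[float]]:
    """Unweighted mean of class probabilities across models (assumed fusion rule)."""
    n_models = len(model_probs)
    n_clips = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [sum(m[i][c] for m in model_probs) / n_models for c in range(n_classes)]
        for i in range(n_clips)
    ]


def predict(probs: Probs) -> List[int]:
    """Pick the highest-scoring class index for each clip."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


def accuracy(pred: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of clips whose predicted class matches the label."""
    return sum(p == y for p, y in zip(pred, labels)) / len(labels)


# Toy data (hypothetical): three models, three clips, three scene classes.
# Each model misclassifies a different clip, so their errors are complementary.
labels = [0, 1, 2]
model_a = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.5, 0.1, 0.4]]  # wrong on clip 2
model_b = [[0.3, 0.4, 0.3], [0.1, 0.8, 0.1], [0.1, 0.2, 0.7]]  # wrong on clip 0
model_c = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.2, 0.2, 0.6]]  # wrong on clip 1

fused = ensemble_average([model_a, model_b, model_c])
single_acc = [accuracy(predict(m), labels) for m in (model_a, model_b, model_c)]
ensemble_acc = accuracy(predict(fused), labels)

if __name__ == "__main__":
    print("single-model accuracies:", single_acc)
    print("ensemble accuracy:", ensemble_acc)
```

In this toy case each model gets two of three clips right, while the averaged scores classify all three correctly. That is the kind of complementarity the abstract attributes to differing front-ends and architectures. Weighted averaging or fusing logits instead of probabilities would be equally plausible readings.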
First Page: 94
Last Page: 98
DOI: https://doi.org/10.33682/8rd2-g787
Type: Article
Appears in Collections: Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)

Files in This Item:
File: DCASE2019Workshop_Huang_52.pdf
Size: 424.6 kB
Format: Adobe PDF

