Receptive-Field-Regularized CNN Variants for Acoustic Scene Classification

Koutini, Khaled; Eghbal-zadeh, Hamid; Widmer, Gerhard

doi:https://doi.org/10.33682/cjd9-kc43

Title:	Receptive-Field-Regularized CNN Variants for Acoustic Scene Classification
Authors:	Koutini, Khaled; Eghbal-zadeh, Hamid; Widmer, Gerhard
Date Issued:	Oct-2019
Citation:	K. Koutini, H. Eghbal-zadeh & G. Widmer, "Receptive-Field-Regularized CNN Variants for Acoustic Scene Classification", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 124–128, New York University, NY, USA, Oct. 2019
Abstract:	Acoustic scene classification and related tasks have been dominated by Convolutional Neural Networks (CNNs). Top-performing CNNs mainly use audio spectrograms as input and borrow their architectural design primarily from computer vision. A recent study has shown that restricting the receptive field (RF) of CNNs in appropriate ways is crucial for their performance, robustness and generalization in audio tasks. One side effect of restricting the RF of CNNs is that more frequency information is lost. In this paper, we first perform a systematic investigation of different RF configurations for various CNN architectures on the DCASE 2019 Task 1.A dataset. Second, we introduce Frequency Aware CNNs to compensate for the lack of frequency information caused by the restricted RF, and experimentally determine if and in what RF ranges they yield additional improvement. The results of these investigations are several well-performing submissions to different tasks in the DCASE 2019 Challenge.
First Page:	124
Last Page:	128
DOI:	https://doi.org/10.33682/cjd9-kc43
Type:	Article
Appears in Collections:	Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)

Files in This Item:

File	Size	Format
DCASE2019Workshop_Koutini_55.pdf	550.36 kB	Adobe PDF
