Full metadata record
DC Field | Value | Language
dc.contributor.author | Nichols, Eric |
dc.contributor.author | Tompkins, Daniel |
dc.contributor.author | Fan, Jianyu |
dc.date.accessioned | 2019-10-24T01:50:24Z | -
dc.date.available | 2019-10-24T01:50:24Z | -
dc.date.issued | 2019-10 |
dc.identifier.citation | E. Nichols, D. Tompkins & J. Fan, "Hierarchical Sound Event Classification", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 248–252, New York University, NY, USA, Oct. 2019 | en
dc.identifier.uri | http://hdl.handle.net/2451/60770 | -
dc.description.abstract | Task 5 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 challenge is "urban sound tagging". Given a set of known sound categories and sub-categories, the goal is to build a multi-label audio classification model to predict whether each sound category is present or absent in an audio recording. We developed a model composed of a preprocessing layer that converts audio to a log-mel spectrogram, a VGG-inspired Convolutional Neural Network (CNN) that generates an embedding for the spectrogram, a pre-trained VGGish network that generates a separate audio embedding, and finally a series of fully-connected layers that converts these two embeddings (concatenated) into a multi-label classification. This model directly outputs both “fine” and “coarse” labels; it treats the task as a 37-way multi-label classification problem. One version of this network did better at the coarse labels (CNN+VGGish1); another did better with fine labels on Micro AUPRC (CNN+VGGish2). A separate family of CNN models was also trained to take into account the hierarchical nature of the labels (Hierarchical1, Hierarchical2, and Hierarchical3). The hierarchical models perform better on Micro AUPRC with fine-level classification. | en
dc.rights | Copyright The Authors, 2019 | en
dc.title | Hierarchical Sound Event Classification | en
dc.type | Article | en
dc.identifier.DOI | https://doi.org/10.33682/v0ns-1352 |
dc.description.firstPage | 248 |
dc.description.lastPage | 252 |
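
The abstract describes a classification head that concatenates two embeddings and emits 37 independent present/absent decisions. The following is a minimal sketch of that idea only, not the authors' implementation: the embedding sizes (`CNN_DIM`, `VGGISH_DIM`), the single linear layer, the random weights, and the 0.5 threshold are all assumptions made for illustration, and the record does not state any of them.

```python
import math
import random

NUM_LABELS = 37   # from the abstract: fine and coarse labels predicted jointly
CNN_DIM = 64      # assumption: size of the spectrogram-CNN embedding
VGGISH_DIM = 128  # assumption: size of the pre-trained audio embedding


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def multilabel_head(cnn_embedding, vggish_embedding, weights, biases):
    """Concatenate two embeddings and map them to one probability per label.

    Each label gets its own sigmoid, so any number of labels can be
    active at once (multi-label, unlike a softmax over classes).
    """
    features = list(cnn_embedding) + list(vggish_embedding)
    probs = []
    for w_row, b in zip(weights, biases):
        logit = sum(w * f for w, f in zip(w_row, features)) + b
        probs.append(sigmoid(logit))
    return probs


# Placeholder inputs and untrained random weights, for shape checking only.
rng = random.Random(0)
cnn_embedding = [rng.gauss(0.0, 1.0) for _ in range(CNN_DIM)]
vggish_embedding = [rng.gauss(0.0, 1.0) for _ in range(VGGISH_DIM)]
weights = [
    [rng.gauss(0.0, 0.1) for _ in range(CNN_DIM + VGGISH_DIM)]
    for _ in range(NUM_LABELS)
]
biases = [0.0] * NUM_LABELS

probs = multilabel_head(cnn_embedding, vggish_embedding, weights, biases)
predicted = [i for i, p in enumerate(probs) if p >= 0.5]  # assumed threshold
print(len(probs), "label probabilities;", len(predicted), "labels above threshold")
```

Using one sigmoid per label, instead of a softmax across labels, is what lets several sound categories be tagged as present in the same recording.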
Appears in Collections: Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)

Files in This Item:
File | Size | Format
DCASE2019Workshop_Tompkins_64.pdf | 523.61 kB | Adobe PDF


Items in FDA are protected by copyright, with all rights reserved, unless otherwise indicated.