Hierarchical Sound Event Classification

Nichols, Eric; Tompkins, Daniel; Fan, Jianyu

doi:https://doi.org/10.33682/v0ns-1352

Full metadata record

DC Field	Value	Language
dc.contributor.author	Nichols, Eric
dc.contributor.author	Tompkins, Daniel
dc.contributor.author	Fan, Jianyu
dc.date.accessioned	2019-10-24T01:50:24Z	-
dc.date.available	2019-10-24T01:50:24Z	-
dc.date.issued	2019-10
dc.identifier.citation	E. Nichols, D. Tompkins & J. Fan, "Hierarchical Sound Event Classification", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 248–252, New York University, NY, USA, Oct. 2019	en
dc.identifier.uri	http://hdl.handle.net/2451/60770	-
dc.description.abstract	Task 5 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 challenge is "urban sound tagging". Given a set of known sound categories and sub-categories, the goal is to build a multi-label audio classification model to predict whether each sound category is present or absent in an audio recording. We developed a model composed of a preprocessing layer that converts audio to a log-mel spectrogram, a VGG-inspired Convolutional Neural Network (CNN) that generates an embedding for the spectrogram, a pre-trained VGGish network that generates a separate audio embedding, and finally a series of fully-connected layers that converts these two embeddings (concatenated) into a multi-label classification. This model directly outputs both “fine” and “coarse” labels; it treats the task as a 37-way multi-label classification problem. One version of this network did better at the coarse labels (CNN+VGGish1); another did better with fine labels on Micro AUPRC (CNN+VGGish2). A separate family of CNN models was also trained to take into account the hierarchical nature of the labels (Hierarchical1, Hierarchical2, and Hierarchical3). The hierarchical models perform better on Micro AUPRC with fine-level classification.	en
dc.rights	Copyright The Authors, 2019	en
dc.title	Hierarchical Sound Event Classification	en
dc.type	Article	en
dc.identifier.DOI	https://doi.org/10.33682/v0ns-1352
dc.description.firstPage	248
dc.description.lastPage	252
Appears in Collections:	Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)
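
The abstract above describes concatenating two embeddings and mapping them to 37 independent present/absent outputs. The following is a minimal sketch of that final step only, not the authors' implementation: it uses a single dense layer where the paper uses a series of fully-connected layers, and the embedding sizes, random weights, function names, and the 0.5 threshold are all assumptions made for illustration.

```python
import math
import random

NUM_LABELS = 37  # from the abstract: fine and coarse labels are predicted together


def sigmoid(x):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def classify(cnn_embedding, vggish_embedding, weights, biases):
    """Concatenate the two embeddings and score every label independently.

    Each label gets its own sigmoid, so several labels can be present at
    once (multi-label), unlike a softmax that picks a single class.
    """
    joint = list(cnn_embedding) + list(vggish_embedding)
    probs = []
    for row, bias in zip(weights, biases):
        logit = sum(w * x for w, x in zip(row, joint)) + bias
        probs.append(sigmoid(logit))
    return probs


# Hypothetical embedding sizes and random stand-in values, for illustration only.
random.seed(0)
CNN_DIM, VGGISH_DIM = 256, 128
cnn_embedding = [random.gauss(0.0, 1.0) for _ in range(CNN_DIM)]
vggish_embedding = [random.gauss(0.0, 1.0) for _ in range(VGGISH_DIM)]
weights = [
    [random.gauss(0.0, 0.01) for _ in range(CNN_DIM + VGGISH_DIM)]
    for _ in range(NUM_LABELS)
]
biases = [0.0] * NUM_LABELS

probs = classify(cnn_embedding, vggish_embedding, weights, biases)
present = [p >= 0.5 for p in probs]  # assumed threshold for a present/absent decision
```

Here `probs` holds one probability per label and `present` holds the corresponding present/absent decisions; in a trained model the weights would come from fitting against the annotated recordings rather than from a random generator.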

Files in This Item:

File	Size	Format
DCASE2019Workshop_Tompkins_64.pdf	523.61 kB	Adobe PDF
