Sound Event Classification and Detection with Weakly Labeled Data

Adavanne, Sharath; Fayek, Haytham; Tourbabin, Vladimir

DOI: https://doi.org/10.33682/fx8n-cm43

Full metadata record

DC Field	Value	Language
dc.contributor.author	Adavanne, Sharath
dc.contributor.author	Fayek, Haytham
dc.contributor.author	Tourbabin, Vladimir
dc.date.accessioned	2019-10-24T01:50:22Z	-
dc.date.available	2019-10-24T01:50:22Z	-
dc.date.issued	2019-10
dc.identifier.citation	S. Adavanne, H. Fayek & V. Tourbabin, "Sound Event Classification and Detection with Weakly Labeled Data", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 15–19, New York University, NY, USA, Oct. 2019	en
dc.identifier.uri	http://hdl.handle.net/2451/60757	-
dc.description.abstract	The Sound Event Classification (SEC) task involves recognizing the set of active sound events in an audio recording. The Sound Event Detection (SED) task involves, in addition to SEC, detecting the temporal onset and offset of every sound event in an audio recording. Generally, SEC and SED are treated as supervised classification tasks that require labeled datasets. SEC only requires weak labels, i.e., annotation of active sound events without the temporal information, whereas SED requires strong labels, i.e., annotation of the onset and offset times of every sound event, which makes annotation for SED more tedious than for SEC. In this paper, we propose two methods for joint SEC and SED using weakly labeled data: a Fully Convolutional Network (FCN) and a novel method that combines a Convolutional Neural Network with an attention layer (CNNatt). Unlike most prior work, the proposed methods do not assume that the weak labels are active during the entire recording and can scale to large datasets. We report state-of-the-art SEC results obtained with the largest weakly labeled dataset, Audioset.	en
dc.rights	Copyright The Authors, 2019	en
dc.title	Sound Event Classification and Detection with Weakly Labeled Data	en
dc.type	Article	en
dc.identifier.DOI	https://doi.org/10.33682/fx8n-cm43
dc.description.firstPage	15
dc.description.lastPage	19
Appears in Collections:	Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)

Files in This Item:

File	Size	Format
DCASE2019Workshop_Adavanne_45.pdf	550.47 kB	Adobe PDF
