Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Meire, Maarten | |
dc.contributor.author | Karsmakers, Peter | |
dc.contributor.author | Vuegen, Lode | |
dc.date.accessioned | 2019-10-24T01:50:20Z | - |
dc.date.available | 2019-10-24T01:50:20Z | - |
dc.date.issued | 2019-10 | |
dc.identifier.citation | M. Meire, P. Karsmakers & L. Vuegen, "The Impact of Missing Labels and Overlapping Sound Events on Multi-label Multi-instance Learning for Sound Event Classification", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 159–163, New York University, NY, USA, Oct. 2019 | en |
dc.identifier.uri | http://hdl.handle.net/2451/60750 | - |
dc.description.abstract | Automated analysis of complex scenes of everyday sounds might help us navigate within the enormous amount of data and help us make better decisions based on the sounds around us. For this purpose classification models are required that translate raw audio to meaningful event labels. The specific task that this paper targets is that of learning sound event classifier models by a set of example sound segments that contain multiple potentially overlapping sound events and that are labeled with multiple weak sound event class names. This involves a combination of both multi-label and multi-instance learning. This paper investigates two state-of-the-art methodologies that allow this type of learning, LRM-NMD and CNN. Besides comparing the accuracy in terms of correct sound event classifications, also the robustness to missing labels and to overlap of the sound events in the sound segments is evaluated. For small training set sizes LRM-NMD clearly outperforms CNN with an accuracy that is 40 to 50% higher. LRM-NMD suffers only slightly from overlapping sound events during training while CNN suffers a substantial drop in classification accuracy, in the order of 10 to 20%, when sound events have a 100% overlap. Both methods show good robustness to missing labels. No matter how many labels are missing in a single segment (that contains multiple sound events) CNN converges to 97% accuracy when enough training data is available. LRM-NMD on the other hand shows a slight performance drop when the amount of missing labels increases. | en |
dc.rights | Distributed under the terms of the Creative Commons Attribution 4.0 International (CC-BY) license. | en |
dc.title | The Impact of Missing Labels and Overlapping Sound Events on Multi-label Multi-instance Learning for Sound Event Classification | en |
dc.type | Article | en |
dc.identifier.DOI | https://doi.org/10.33682/y8xs-0463 | |
dc.description.firstPage | 159 | |
dc.description.lastPage | 163 | |
Appears in Collections: | Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019) |
Files in This Item:
File | Size | Format |
---|---|---|
DCASE2019Workshop_Meire_22.pdf | 510.79 kB | Adobe PDF |