HODGEPODGE: Sound Event Detection Based on Ensemble of Semi-Supervised Learning Methods

Shi, Ziqiang; Liu, Liu; Lin, Huibin; Liu, Rujie; Shi, Anyan

doi:https://doi.org/10.33682/9kcj-bq06

Title:	HODGEPODGE: Sound Event Detection Based on Ensemble of Semi-Supervised Learning Methods
Authors:	Shi, Ziqiang; Liu, Liu; Lin, Huibin; Liu, Rujie; Shi, Anyan
Date Issued:	Oct-2019
Citation:	Z. Shi, L. Liu, H. Lin, R. Liu & A. Shi, "HODGEPODGE: Sound Event Detection Based on Ensemble of Semi-Supervised Learning Methods", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 224–228, New York University, NY, USA, Oct. 2019
Abstract:	In this paper, we present a method called HODGEPODGE for large-scale detection of sound events using weakly labeled, synthetic, and unlabeled data, as proposed in the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 challenge Task 4: Sound event detection in domestic environments. To perform this task, we adopted a convolutional recurrent neural network (CRNN) as our backbone network. To deal with a small amount of tagged data and a large amount of unlabeled in-domain data, we focus primarily on how to apply semi-supervised learning methods efficiently to make full use of the limited data. Three semi-supervised learning principles are used in our system: 1) consistency regularization applied to data augmentation; 2) a MixUp regularizer requiring that the prediction for an interpolation of two inputs be close to the interpolation of the predictions for each individual input; 3) MixUp regularization applied to interpolation between data augmentations. We also tried an ensemble of various models trained with different semi-supervised learning principles. Our proposed approach significantly improved on the baseline, achieving an event-based F-measure of 42.0% compared to the baseline's 25.8% on the provided official evaluation dataset. Our submissions ranked third among 18 teams in Task 4.
First Page:	224
Last Page:	228
DOI:	https://doi.org/10.33682/9kcj-bq06
Type:	Article
Appears in Collections:	Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019)
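
The second principle in the abstract (the prediction for an interpolation of two inputs should be close to the interpolation of the individual predictions) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_model` is a hypothetical stand-in for the CRNN, the squared-error distance and the Beta(0.5, 0.5) mixing coefficient are assumptions, and all function names are invented for illustration.

```python
import numpy as np


def mixup(a, b, lam):
    """Linear interpolation lam * a + (1 - lam) * b."""
    return lam * a + (1.0 - lam) * b


def interpolation_consistency_loss(predict, x1, x2, lam):
    """Mean squared gap between the prediction on a mixed input
    and the mix of the two individual predictions.
    The squared-error distance is an assumption of this sketch."""
    pred_of_mix = predict(mixup(x1, x2, lam))
    mix_of_preds = mixup(predict(x1), predict(x2), lam)
    return float(np.mean((pred_of_mix - mix_of_preds) ** 2))


def toy_model(x, w):
    """Hypothetical stand-in for the CRNN: sigmoid of a linear map."""
    return 1.0 / (1.0 + np.exp(-(x @ w)))


rng = np.random.default_rng(0)
w = rng.normal(size=(8, 3))    # toy weights: 8 features, 3 event classes
x1 = rng.normal(size=(4, 8))   # first batch of unlabeled inputs
x2 = rng.normal(size=(4, 8))   # second batch of unlabeled inputs
lam = rng.beta(0.5, 0.5)       # mixing coefficient (assumed distribution)

loss = interpolation_consistency_loss(lambda x: toy_model(x, w), x1, x2, lam)
print(f"interpolation consistency loss: {loss:.6f}")
```

Because no labels are needed, a term of this kind can be computed on unlabeled clips and added to the supervised loss. Under the same assumptions, the third principle would apply the same function to two augmented views of the data in place of `x1` and `x2`.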
