Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Singh, Arshdeep | |
dc.contributor.author | Rajan, Padmanabhan | |
dc.contributor.author | Bhavsar, Arnav | |
dc.date.accessioned | 2019-10-24T01:50:23Z | - |
dc.date.available | 2019-10-24T01:50:23Z | - |
dc.date.issued | 2019-10 | |
dc.identifier.citation | A. Singh, P. Rajan & A. Bhavsar, "Deep Multi-view Features from Raw Audio for Acoustic Scene Classification", Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pages 229–233, New York University, NY, USA, Oct. 2019 | en |
dc.identifier.uri | http://hdl.handle.net/2451/60765 | - |
dc.description.abstract | In this paper, we propose a feature representation framework that captures features at different levels of abstraction for audio scene classification. A pre-trained deep convolutional neural network, SoundNet, is used to extract features from various intermediate layers for a given audio file. We assume that the features obtained from different intermediate layers provide different types of abstraction and exhibit complementary information. Thus, combining the intermediate features of various layers can improve classification performance in discriminating audio scenes. To obtain the representations, we discard redundant filters in the intermediate layers using an analysis-of-variance-based redundancy removal framework. This reduces dimensionality and computational complexity. Next, shift-invariant, fixed-length compressed representations across layers are obtained by aggregating the responses of only the important filters. The compressed representations are then stacked to obtain a supervector. Finally, we perform classification using multi-layer perceptron and support vector machine models. We comprehensively validate the above assumption on two public datasets: Making Sense of Sounds and open-set acoustic scene classification DCASE 2019. | en |
dc.rights | Distributed under the terms of the Creative Commons Attribution 4.0 International (CC-BY) license. | en |
dc.title | Deep Multi-view Features from Raw Audio for Acoustic Scene Classification | en |
dc.type | Article | en |
dc.identifier.DOI | https://doi.org/10.33682/05gk-pd08 | |
dc.description.firstPage | 229 | |
dc.description.lastPage | 233 | |
Appears in Collections: | Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019) |
Files in This Item:
File | Size | Format | |
---|---|---|---|
DCASE2019Workshop_Singh_32.pdf | 1.6 MB | Adobe PDF | View/Open |
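
The abstract outlines a pipeline of per-layer filter responses, ANOVA-based filter selection, aggregation over time, and stacking into a supervector. The sketch below illustrates that flow on toy data and is not the authors' implementation. Several choices are assumptions made only for illustration: mean pooling as the aggregation, pooling before scoring, ranking filters by a one-way ANOVA F-statistic and keeping the top `keep`, and all function names. SoundNet feature extraction and the final MLP/SVM classifier are left out; the activations are hand-written numbers.

```python
from statistics import mean


def pool_over_time(activation):
    """Average each filter's response over time: (filters x time) -> (filters,).

    Mean pooling is an assumed aggregation; it yields a fixed-length vector
    regardless of clip duration.
    """
    return [mean(row) for row in activation]


def anova_f(values, labels):
    """One-way ANOVA F-statistic of one scalar feature grouped by class label."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand = mean(values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (between / df_between) / (within / df_within)


def select_filters(pooled, labels, keep):
    """Indices of the `keep` filters with the largest F-statistic in one layer."""
    n_filters = len(pooled[0])
    scores = [anova_f([clip[f] for clip in pooled], labels) for f in range(n_filters)]
    ranked = sorted(range(n_filters), key=lambda f: scores[f], reverse=True)
    return sorted(ranked[:keep])


def supervector(pooled_by_layer, kept_by_layer):
    """Concatenate the retained, pooled filter responses of every layer."""
    vec = []
    for pooled, kept in zip(pooled_by_layer, kept_by_layer):
        vec.extend(pooled[f] for f in kept)
    return vec


# Toy data: 4 clips, 2 classes, 2 "layers"; clips differ in length.
labels = ["park", "park", "street", "street"]
layer1 = [  # 3 filters: discriminative, constant, uninformative
    [[1.0, 1.5, 0.5], [0.5, 0.5, 0.5], [0.0, 1.0, 0.5]],
    [[1.25, 1.25], [0.5, 0.5], [0.25, 0.25]],
    [[3.0, 3.5, 2.5, 3.0], [0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]],
    [[3.25, 2.75], [0.5, 0.5], [0.0, 0.5]],
]
layer2 = [  # 2 filters: uninformative, discriminative
    [[0.25, 0.25], [5.0, 5.5]],
    [[0.5, 1.0], [4.75, 5.25]],
    [[0.0, 0.5], [1.0, 1.5]],
    [[0.75, 0.75], [1.0, 1.0]],
]

pooled1 = [pool_over_time(a) for a in layer1]
pooled2 = [pool_over_time(a) for a in layer2]
kept1 = select_filters(pooled1, labels, keep=1)
kept2 = select_filters(pooled2, labels, keep=1)
vectors = [supervector([p1, p2], [kept1, kept2]) for p1, p2 in zip(pooled1, pooled2)]
print(kept1, kept2, vectors[0])
```

Each resulting vector has the same length for every clip, so it could be passed to any standard classifier. The number of filters kept per layer is a free parameter in this sketch.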