Masked Conditional Neural Networks for Automatic Sound Events Recognition

Medhat, Fady; Chesmore, David; Robinson, John

doi:10.1109/DSAA.2017.43

arXiv:1802.05792 (cs)

[Submitted on 15 Feb 2018 (v1), last revised 28 Apr 2019 (this version, v2)]

Abstract: Deep neural network architectures designed for application domains other than sound, especially image recognition, may not optimally harness the time-frequency representation when adapted to the sound recognition problem. In this work, we explore the ConditionaL Neural Network (CLNN) and the Masked ConditionaL Neural Network (MCLNN) for multi-dimensional temporal signal recognition. The CLNN considers the inter-frame relationship, and the MCLNN enforces a systematic sparseness over the network's links. This sparseness enables learning in frequency bands rather than bins, which makes the network frequency-shift invariant in a manner that mimics a filterbank. The mask also allows several combinations of features to be considered concurrently, a selection that is usually handcrafted through exhaustive manual search. We applied the MCLNN to the environmental sound recognition problem using the ESC-10 and ESC-50 datasets. The MCLNN achieved competitive performance compared to state-of-the-art Convolutional Neural Networks, using 12% of the parameters and without augmentation.
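
The idea of a systematic sparseness that restricts each hidden unit to a band of frequency bins can be illustrated with a small binary mask. The following is a minimal sketch only, not the authors' formulation: the function names `band_mask` and `masked_forward`, the parameters `bandwidth` and `overlap`, and the rule for shifting the band from one hidden unit to the next are all assumptions chosen for illustration. It also handles a single frame and leaves out the inter-frame conditioning of the CLNN.

```python
def band_mask(n_features, n_hidden, bandwidth, overlap):
    """Build a binary mask (n_features x n_hidden) as nested lists.

    Assumed rule for this sketch: hidden unit j is linked to `bandwidth`
    consecutive frequency bins, and the band start moves down by
    (bandwidth - overlap) bins per hidden unit, wrapping to the top.
    Bands that reach past the last bin are truncated.
    """
    step = bandwidth - overlap
    if bandwidth <= 0 or step <= 0:
        raise ValueError("need bandwidth > 0 and overlap < bandwidth")
    mask = [[0] * n_hidden for _ in range(n_features)]
    for j in range(n_hidden):
        start = (j * step) % n_features
        for i in range(start, min(start + bandwidth, n_features)):
            mask[i][j] = 1
    return mask


def masked_forward(frame, weights, mask):
    """Pre-activation of each hidden unit for one frame.

    The mask is multiplied element-wise with the weights, so links
    outside a unit's band contribute nothing to its sum.
    """
    n_hidden = len(mask[0])
    return [
        sum(frame[i] * weights[i][j] * mask[i][j] for i in range(len(frame)))
        for j in range(n_hidden)
    ]


if __name__ == "__main__":
    mask = band_mask(n_features=6, n_hidden=4, bandwidth=3, overlap=1)
    for row in mask:
        print(row)
    ones = [[1] * 4 for _ in range(6)]
    print(masked_forward([1] * 6, ones, mask))  # [3, 3, 2, 3]
```

With all-ones weights and an all-ones frame, each output simply counts the active links of a hidden unit, which makes the band structure easy to inspect. The third unit has only two links because its band is cut off at the last frequency bin.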

Comments:	Restricted Boltzmann Machine, RBM, Conditional RBM, CRBM, Deep Belief Net, DBN, Conditional Neural Network, CLNN, Masked Conditional Neural Network, MCLNN, Environmental Sound Recognition, ESR
Subjects:	Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
Cite as:	arXiv:1802.05792 [cs.LG]
	(or arXiv:1802.05792v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1802.05792
Journal reference:	IEEE International Conference on Data Science and Advanced Analytics (DSAA) Year: 2017, Pages: 389 - 394
Related DOI:	https://doi.org/10.1109/DSAA.2017.43

Submission history

From: Fady Medhat
[v1] Thu, 15 Feb 2018 23:24:39 UTC (741 KB)
[v2] Sun, 28 Apr 2019 09:53:24 UTC (1,664 KB)
