Maximizing Audio Event Detection Model Performance on Small Datasets Through Knowledge Transfer, Data Augmentation, And Pretraining: An Ablation Study

Tompkins, Daniel; Kumar, Kshitiz; Wu, Jian

arXiv:2202.03514 (cs)

[Submitted on 7 Feb 2022]

Abstract: An Xception model reaches state-of-the-art (SOTA) accuracy on the ESC-50 dataset for audio event detection through knowledge transfer from ImageNet weights, pretraining on AudioSet, and an on-the-fly data augmentation pipeline. This paper presents an ablation study that analyzes which components contribute to the boost in performance and training time. A smaller Xception model is also presented which nears SOTA performance with almost a third of the parameters.

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as: arXiv:2202.03514 [cs.SD] (or arXiv:2202.03514v1 [cs.SD] for this version)
DOI: https://doi.org/10.48550/arXiv.2202.03514

Submission history

From: Daniel Tompkins
[v1] Mon, 7 Feb 2022 20:57:40 UTC (253 KB)
