Explainable by-design Audio Segmentation through Non-Negative Matrix Factorization and Probing

Lebourdais, Martin; Mariotte, Théo; Almudévar, Antonio; Tahon, Marie; Ortega, Alfonso

arXiv:2406.13385 (eess)

[Submitted on 19 Jun 2024]

Abstract: Audio segmentation is a key task for many speech technologies. Most segmentation systems are based on neural networks, which achieve high performance but are usually considered black boxes. However, in many domains, such as health and forensics, good performance is not enough: the output decision must also be explained. To be interpretable, explanations derived directly from latent representations need to satisfy "good" properties, such as informativeness, compactness, or modularity. In this article, we propose an explainable-by-design audio segmentation model based on non-negative matrix factorization (NMF), which is a good candidate for the design of interpretable representations. We show that our model reaches good segmentation performance, and we present in-depth analyses of the latent representation extracted from the non-negative matrix. The proposed approach opens new perspectives toward the evaluation of interpretable representations according to these "good" properties.
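For readers unfamiliar with NMF, the following is a minimal sketch of plain NMF on a toy non-negative matrix, using the classic multiplicative updates for the Frobenius loss. It is not the model proposed in the paper: the function name `nmf`, the rank, the iteration count, and the random toy "spectrogram" are all assumptions made for illustration only.

```python
import numpy as np


def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (n_freq x n_frames) as W @ H.

    W (n_freq x rank) holds non-negative spectral components and
    H (rank x n_frames) holds their non-negative activations over time.
    Uses multiplicative updates for the Frobenius reconstruction loss.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay >= 0.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy input (hypothetical): a non-negative matrix built from 3 components.
toy_rng = np.random.default_rng(1)
V = toy_rng.random((64, 3)) @ toy_rng.random((3, 100))

W, H = nmf(V, rank=3)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

Because both factors are constrained to be non-negative, each column of `W` can be read as an additive part of the input and each row of `H` as how strongly that part is active in each frame, which is the property that makes NMF attractive for interpretable representations.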

Comments:	Accepted at Interspeech 2024, 5 pages, 2 figures, 3 tables
Subjects:	Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD)
Cite as:	arXiv:2406.13385 [eess.AS]
	(or arXiv:2406.13385v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2406.13385

Submission history

From: Théo Mariotte
[v1] Wed, 19 Jun 2024 09:26:33 UTC (492 KB)
