Unsupervised Graph-based Topic Modeling from Video Transcriptions

Stappen, Lukas; Hagerer, Gerhard; Schuller, Björn W.; Groh, Georg

arXiv:2105.01466v1 (cs)

[Submitted on 4 May 2021 (this version), latest version 28 Oct 2021 (v4)]

Abstract: To unfold the tremendous amount of audiovisual data uploaded daily to social media platforms, effective topic modelling techniques are needed. Existing work tends to apply variants of topic models to text data sets. In this paper, we aim to develop a topic extractor for video transcriptions. The model improves coherence by exploiting neural word embeddings through a graph-based clustering method. Unlike typical topic models, this approach works without knowing the true number of topics. Experimental results on the real-life multimodal data set MuSe-CaR demonstrate that our approach extracts coherent and meaningful topics, outperforming baseline methods. Furthermore, we successfully demonstrate the generalisability of our approach on a pure text review data set.
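
The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of the general idea of grouping word embeddings through a graph without fixing the number of topics in advance. It is not the authors' method: it swaps in a simple stand-in (connected components of a cosine-similarity threshold graph), and the toy vocabulary, the three-dimensional vectors, the `threshold` value, and the function names are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_graph(embeddings, threshold):
    """Link two words whenever their embedding similarity reaches the threshold."""
    words = list(embeddings)
    graph = {w: set() for w in words}
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            if cosine(embeddings[w1], embeddings[w2]) >= threshold:
                graph[w1].add(w2)
                graph[w2].add(w1)
    return graph


def connected_components(graph):
    """Return each connected component as a sorted list of words (one 'topic')."""
    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(sorted(component))
    return clusters


# Hypothetical toy embeddings; real ones would come from a trained embedding model.
TOY_EMBEDDINGS = {
    "engine": (0.9, 0.1, 0.0),
    "motor": (0.85, 0.15, 0.05),
    "horsepower": (0.8, 0.2, 0.1),
    "seat": (0.1, 0.9, 0.1),
    "leather": (0.05, 0.85, 0.2),
    "price": (0.1, 0.1, 0.95),
    "cost": (0.15, 0.05, 0.9),
}

# The number of topics is never passed in; it falls out of the graph structure.
topics = connected_components(build_graph(TOY_EMBEDDINGS, threshold=0.9))
for topic in topics:
    print(topic)
```

With these toy vectors the graph splits into three groups, so three topics emerge even though no topic count was supplied. The similarity threshold takes over the role of that parameter, which is the main design trade-off of this simplified variant.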

Comments:	JT and LS contributed equally to this work
Subjects:	Computation and Language (cs.CL); Multimedia (cs.MM)
Cite as:	arXiv:2105.01466 [cs.CL]
	(or arXiv:2105.01466v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2105.01466

Submission history

From: Lukas Stappen
[v1] Tue, 4 May 2021 12:48:17 UTC (2,186 KB)
[v2] Thu, 14 Oct 2021 18:37:53 UTC (2,638 KB)
[v3] Fri, 22 Oct 2021 10:17:07 UTC (2,759 KB)
[v4] Thu, 28 Oct 2021 12:17:33 UTC (2,762 KB)
