Alignment Helps Make the Most of Multimodal Data

Arnold, Christian; Küpfer, Andreas

arXiv:2405.08454 (cs)

[Submitted on 14 May 2024 (v1), last revised 8 Jul 2024 (this version, v2)]

Abstract: When studying political communication, combining the information from text, audio, and video signals promises to reflect the richness of human communication more comprehensively than any single modality can. However, the heterogeneity, connectedness, and interaction of such multimodal data are challenging to address in modeling. We argue that aligning the respective modalities can be an essential step in fully using the potential of multimodal data because it informs the model with human understanding. Taking the data-generating process of multimodal data into account, our framework proposes four principles to organize alignment and, thus, address these challenges. We illustrate the utility of these principles by analyzing how German MPs address members of the far-right AfD in their speeches and by predicting the tone of video advertising in the context of the 2020 US presidential race. Our paper offers important insights to all who are keen to analyze multimodal data effectively.

Comments:	Working Paper
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2405.08454 [cs.CL]
	(or arXiv:2405.08454v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2405.08454

Submission history

From: Andreas Küpfer
[v1] Tue, 14 May 2024 09:20:59 UTC (451 KB)
[v2] Mon, 8 Jul 2024 11:50:19 UTC (4,565 KB)
