OmniVox: Zero-Shot Emotion Recognition with Omni-LLMs

Murzaku, John; Rambow, Owen

arXiv:2503.21480 (cs)

[Submitted on 27 Mar 2025 (v1), last revised 28 Mar 2025 (this version, v2)]

Abstract: The use of omni-LLMs (large language models that accept any modality as input), particularly for multimodal cognitive state tasks involving speech, is understudied. We present OmniVox, the first systematic evaluation of four omni-LLMs on the zero-shot emotion recognition task. We evaluate on two widely used multimodal emotion benchmarks, IEMOCAP and MELD, and find that zero-shot omni-LLMs outperform or are competitive with fine-tuned audio models. Alongside our audio-only evaluation, we also evaluate omni-LLMs on text-only and combined text-and-audio input. We present acoustic prompting, an audio-specific prompting strategy for omni-LLMs that focuses on acoustic feature analysis, conversation context analysis, and step-by-step reasoning. We compare acoustic prompting to minimal prompting and full chain-of-thought prompting. We perform a context window analysis on IEMOCAP and MELD and find that using context helps, especially on IEMOCAP. We conclude with an error analysis of the acoustic reasoning outputs generated by the omni-LLMs.
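
As a rough illustration of the ideas named in the abstract, the sketch below assembles a text prompt that asks for acoustic feature analysis, conversation context analysis, and step-by-step reasoning, with a configurable context window over prior turns. The prompt wording, the label set, and the function names (`select_context`, `build_acoustic_prompt`) are hypothetical assumptions made for illustration; the abstract does not give the actual prompts, and the audio itself would be passed to the model separately.

```python
from typing import List, Sequence

# Hypothetical label set for illustration only; each benchmark defines its own.
EMOTIONS = ("angry", "happy", "neutral", "sad")


def select_context(turns: Sequence[str], window: int) -> List[str]:
    """Return the last `window` prior turns (empty list when window <= 0)."""
    if window <= 0:
        return []
    return list(turns[-window:])


def build_acoustic_prompt(
    prior_turns: Sequence[str],
    window: int = 2,
    labels: Sequence[str] = EMOTIONS,
) -> str:
    """Assemble an illustrative prompt; the wording is an assumption."""
    context = select_context(prior_turns, window)
    lines: List[str] = []
    if context:
        # Only include a context section when the window selects any turns.
        lines.append("Conversation context (oldest first):")
        lines.extend(f"- {turn}" for turn in context)
    lines += [
        "Listen to the attached audio of the target utterance.",
        "1. Describe its acoustic features (pitch, loudness, speaking rate).",
        "2. Relate the utterance to the conversation context, if any.",
        "3. Reason step by step, then answer with exactly one label.",
        "Labels: " + ", ".join(labels),
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    history = ["A: hi", "B: hello", "A: I lost my keys"]
    print(build_acoustic_prompt(history, window=2))
```

Setting `window=0` yields a prompt with no context section, which is one simple way to compare a no-context condition against larger windows.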

Comments:	Submitted to COLM 2025. Preprint
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2503.21480 [cs.CL]
	(or arXiv:2503.21480v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2503.21480

Submission history

From: John Murzaku
[v1] Thu, 27 Mar 2025 13:12:49 UTC (257 KB)
[v2] Fri, 28 Mar 2025 12:34:25 UTC (257 KB)
