SocialGen: Modeling Multi-Human Social Interaction with Language Models

Yu, Heng; Zhang, Juze; Chen, Changan; Xiang, Tiange; Fang, Yusu; Niebles, Juan Carlos; Adeli, Ehsan

arXiv:2503.22906 (cs)

[Submitted on 28 Mar 2025]

Abstract: Human interactions in everyday life are inherently social, involving engagements with diverse individuals across various contexts. Modeling these social interactions is fundamental to a wide range of real-world applications. In this paper, we introduce SocialGen, the first unified motion-language model capable of modeling interaction behaviors among varying numbers of individuals, to address this crucial yet challenging problem. Unlike prior methods that are limited to two-person interactions, we propose a novel social motion representation that supports tokenizing the motions of an arbitrary number of individuals and aligning them with the language space. This alignment enables the model to leverage rich, pretrained linguistic knowledge to better understand and reason about human social behaviors. To tackle the challenges of data scarcity, we curate a comprehensive multi-human interaction dataset, SocialX, enriched with textual annotations. Leveraging this dataset, we establish the first comprehensive benchmark for multi-human interaction tasks. Our method achieves state-of-the-art performance across motion-language tasks, setting a new standard for multi-human interaction modeling.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2503.22906 [cs.CV]
	(or arXiv:2503.22906v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.22906

Submission history

From: Heng Yu
[v1] Fri, 28 Mar 2025 22:57:25 UTC (18,322 KB)
