Soft-Prompting with Graph-of-Thought for Multi-modal Representation Learning

Yang, Juncheng; Li, Zuchao; Xie, Shuai; Yu, Wei; Li, Shijun; Du, Bo

arXiv:2404.04538 (cs)

[Submitted on 6 Apr 2024]

Abstract: The chain-of-thought technique has been well received in multi-modal tasks. It is a step-by-step linear reasoning process that adjusts the length of the chain to improve the performance of generated prompts. However, human thought processes are predominantly non-linear: they encompass multiple aspects simultaneously and employ dynamic adjustment and updating mechanisms. Therefore, we propose a novel Aggregation-Graph-of-Thought (AGoT) mechanism for soft-prompt tuning in multi-modal representation learning. AGoT models the human thought process as a chain and, in addition, models each step of that chain as a reasoning aggregation graph, so that the multiple aspects of thinking overlooked in single-step reasoning are captured. This turns the entire reasoning process into prompt aggregation and prompt flow operations. Experiments show that our multi-modal model enhanced with AGoT soft-prompting achieves good results in several tasks such as text-image retrieval, visual question answering, and image recognition. In addition, we demonstrate that it has good domain generalization performance due to better reasoning.
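
The two operations named in the abstract, prompt aggregation and prompt flow, can be pictured with a toy sketch. This is not the authors' implementation: the abstract does not say how aggregation weights or the flow between steps are computed, so the softmax weighting, the fixed blend factor `alpha`, and every name below (`aggregate`, `flow`, `run_chain`) are assumptions made purely for illustration, with plain Python lists standing in for soft-prompt embeddings.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(node_prompts: Sequence[Vector], scores: Sequence[float]) -> Vector:
    """Assumed 'prompt aggregation': softmax-weighted sum of the prompt
    vectors attached to the nodes of one reasoning step."""
    weights = softmax(scores)
    dim = len(node_prompts[0])
    return [sum(w * p[i] for w, p in zip(weights, node_prompts)) for i in range(dim)]


def flow(previous: Vector, aggregated: Vector, alpha: float) -> Vector:
    """Assumed 'prompt flow': blend the previous step's prompt with the
    current aggregated prompt using a fixed factor alpha in [0, 1]."""
    return [(1.0 - alpha) * p + alpha * a for p, a in zip(previous, aggregated)]


def run_chain(
    initial: Vector,
    steps: Sequence[Tuple[Sequence[Vector], Sequence[float]]],
    alpha: float = 0.5,
) -> Vector:
    """Walk the chain: each step aggregates its own graph of node prompts,
    then passes the result forward through the flow operation."""
    prompt = initial
    for node_prompts, scores in steps:
        prompt = flow(prompt, aggregate(node_prompts, scores), alpha)
    return prompt


if __name__ == "__main__":
    # Two reasoning steps, each with two hypothetical aspect nodes.
    steps = [
        ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
        ([[1.0, 1.0], [3.0, 3.0]], [0.0, 0.0]),
    ]
    print(run_chain([0.0, 0.0], steps, alpha=0.5))
```

In a trained model the scores and the blend factor would presumably be learned or input-dependent rather than fixed constants; the sketch only shows how a chain of per-step aggregation graphs reduces to repeated aggregate-then-flow updates.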

Comments:	This paper is accepted to LREC-COLING 2024
Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2404.04538 [cs.AI]
	(or arXiv:2404.04538v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2404.04538

Submission history

From: Juncheng Yang
[v1] Sat, 6 Apr 2024 07:39:44 UTC (2,115 KB)
