A Tale of Two Graphs: Freezing and Denoising Graph Structures for Multimodal Recommendation

Zhou, Xin; Shen, Zhiqi

doi:10.1145/3581783.3611943

arXiv:2211.06924 (cs)

[Submitted on 13 Nov 2022 (v1), last revised 23 Aug 2023 (this version, v3)]

Abstract: Multimodal recommender systems that use multimodal features (e.g., images and textual descriptions) typically achieve better recommendation accuracy than general recommendation models based solely on user-item interactions. Prior work generally fuses multimodal features into item ID embeddings to enrich item representations, and thus fails to capture the latent semantic item-item structures. In this context, LATTICE proposes to learn the latent structure between items explicitly and achieves state-of-the-art performance for multimodal recommendation. However, we argue that the latent graph structure learning of LATTICE is both inefficient and unnecessary. Experimentally, we demonstrate that freezing its item-item structure before training can also achieve competitive performance. Based on this finding, we propose a simple yet effective model, dubbed FREEDOM, that FREEzes the item-item graph and DenOises the user-item interaction graph simultaneously for Multimodal recommendation. Theoretically, we examine the design of FREEDOM from a graph spectral perspective and demonstrate that it possesses a tighter upper bound on the graph spectrum. To denoise the user-item interaction graph, we devise a degree-sensitive edge pruning method, which rejects possibly noisy edges with high probability when sampling the graph. We evaluate the proposed model on three real-world datasets and show that FREEDOM significantly outperforms the strongest current baselines. Compared with LATTICE, FREEDOM achieves an average improvement of 19.07% in recommendation accuracy while reducing its memory cost by up to 6$\times$ on large graphs. The source code is available at: this https URL.
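The degree-sensitive edge pruning mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the sampling rule, so the weighting below is an assumption: each user-item edge is kept with weight 1/sqrt(d_u * d_i), where d_u and d_i are the degrees of its endpoints, so edges between high-degree nodes are the most likely to be rejected. The function name `degree_sensitive_prune`, the `keep_ratio` parameter, and the use of Efraimidis-Spirakis keys for weighted sampling without replacement are all illustrative choices, not the authors' implementation.

```python
import math
import random
from collections import Counter


def degree_sensitive_prune(edges, keep_ratio, seed=0):
    """Sample a subset of (user, item) edges, favouring low-degree endpoints.

    Hypothetical sketch: the weight 1/sqrt(d_u * d_i) is an assumption,
    not a rule stated in the abstract.
    """
    rng = random.Random(seed)
    deg_u = Counter(u for u, _ in edges)  # user degrees
    deg_i = Counter(i for _, i in edges)  # item degrees
    n_keep = max(1, round(keep_ratio * len(edges)))

    # Efraimidis-Spirakis keys: drawing r ** (1 / w) per edge and taking the
    # largest keys gives a weighted sample without replacement.
    keyed = []
    for u, i in edges:
        w = 1.0 / math.sqrt(deg_u[u] * deg_i[i])
        keyed.append((rng.random() ** (1.0 / w), (u, i)))
    keyed.sort(reverse=True)
    return [edge for _, edge in keyed[:n_keep]]


# Toy interaction graph: user 0 and item 0 are high-degree hubs.
edges = [(0, 0), (0, 1), (0, 2), (1, 0), (2, 0), (3, 3)]
kept = degree_sensitive_prune(edges, keep_ratio=0.5, seed=42)
print(kept)
```

Resampling with a different seed at each training epoch would give a different pruned graph each time, which is one plausible way to read "when sampling the graph"; that reading is also an assumption.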

Comments:	Accepted to ACM Multimedia (MM) 2023
Subjects:	Information Retrieval (cs.IR); Multimedia (cs.MM)
Cite as:	arXiv:2211.06924 [cs.IR]
	(or arXiv:2211.06924v3 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2211.06924
Related DOI:	https://doi.org/10.1145/3581783.3611943

Submission history

From: Xin Zhou Dr.
[v1] Sun, 13 Nov 2022 15:11:03 UTC (192 KB)
[v2] Tue, 15 Nov 2022 16:12:46 UTC (191 KB)
[v3] Wed, 23 Aug 2023 04:02:28 UTC (287 KB)
