Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph

Yin, Dacheng; Ren, Xuanchi; Luo, Chong; Wang, Yuwang; Xiong, Zhiwei; Zeng, Wenjun

arXiv:2202.12307 (cs)

[Submitted on 24 Feb 2022]

Abstract: This paper addresses the unsupervised learning of content-style decomposed representations. We first give a definition of style and then model the content-style representation as a token-level bipartite graph. An unsupervised framework, named Retriever, is proposed to learn such representations. First, a cross-attention module is employed to retrieve permutation-invariant (P.I.) information, defined as style, from the input data. Second, a vector quantization (VQ) module is used, together with man-induced constraints, to produce interpretable content tokens. Last, an innovative link attention module serves as the decoder to reconstruct data from the decomposed content and style, with the help of the linking keys. Being modal-agnostic, the proposed Retriever is evaluated in both the speech and image domains. The state-of-the-art zero-shot voice conversion performance confirms the disentangling ability of our framework. Top performance is also achieved in the part discovery task for images, verifying the interpretability of our representation. In addition, the vivid part-based style transfer quality demonstrates the potential of Retriever to support various fascinating generative tasks. Project page at this https URL.
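
The first two steps in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it only shows, under simplifying assumptions, why cross-attention without positional information yields a permutation-invariant summary (the abstract's notion of style) and how a VQ step maps tokens to discrete codebook indices. The function names `cross_attention` and `quantize`, the random `style_queries` and `codebook` (standing in for learned parameters), and all sizes are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross_attention(queries, tokens):
    """Each query attends over all input tokens (keys = values = tokens).

    No positional information is used, so the result depends only on the
    set of tokens, not on their order.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs = []
    for q in queries:
        weights = softmax([dot(q, t) / scale for t in tokens])
        pooled = [sum(w * t[d] for w, t in zip(weights, tokens))
                  for d in range(dim)]
        outputs.append(pooled)
    return outputs


def quantize(tokens, codebook):
    """Map each token to the index of its nearest codebook entry (squared L2)."""
    def nearest(t):
        return min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(t, codebook[k])),
        )
    return [nearest(t) for t in tokens]


# Random placeholders for the input tokens and the (normally learned) parameters.
rng = random.Random(0)
dim = 4
tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(6)]
style_queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(2)]
codebook = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(3)]

# Reversing the token order is one concrete permutation of the input.
permuted = tokens[::-1]

style = cross_attention(style_queries, tokens)
style_permuted = cross_attention(style_queries, permuted)

# Content indices follow the token order, unlike the pooled style vectors.
content_ids = quantize(tokens, codebook)
content_ids_permuted = quantize(permuted, codebook)
```

In this sketch `style` and `style_permuted` agree up to floating-point rounding, while `content_ids_permuted` is simply `content_ids` reversed: the pooled vectors ignore token order and the quantized tokens retain it.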

Comments:	Accepted to ICLR 2022. Project page at this https URL
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2202.12307 [cs.LG]
	(or arXiv:2202.12307v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2202.12307

Submission history

From: Xuanchi Ren
[v1] Thu, 24 Feb 2022 19:00:03 UTC (38,610 KB)
