Stylus: Automatic Adapter Selection for Diffusion Models

Luo, Michael; Wong, Justin; Trabucco, Brandon; Huang, Yanping; Gonzalez, Joseph E.; Chen, Zhifeng; Salakhutdinov, Ruslan; Stoica, Ion

arXiv:2404.18928 (cs)

[Submitted on 29 Apr 2024]

Abstract: Beyond scaling base models with more data or parameters, fine-tuned adapters provide an alternative way to generate high-fidelity, custom images at reduced cost. As such, adapters have been widely adopted by open-source communities, accumulating a database of over 100K adapters, most of which are highly customized and insufficiently described. This paper explores the problem of matching the prompt to a set of relevant adapters, building on recent work that highlights the performance gains of composing adapters. We introduce Stylus, which efficiently selects and automatically composes task-specific adapters based on a prompt's keywords. Stylus outlines a three-stage approach that first summarizes adapters with improved descriptions and embeddings, then retrieves relevant adapters, and finally assembles adapters based on the prompt's keywords by checking how well they fit the prompt. To evaluate Stylus, we developed StylusDocs, a curated dataset featuring 75K adapters with pre-computed adapter embeddings. In our evaluation on popular Stable Diffusion checkpoints, Stylus achieves greater CLIP-FID Pareto efficiency and is twice as preferred, with humans and multimodal models as evaluators, over the base model. See this http URL for more.
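
The three stages named in the abstract (summarize and embed, retrieve, compose by keyword) can be pictured with a small toy sketch. This is not the authors' implementation: the adapter catalog, the bag-of-words "embedding", the stopword list, and the function names `embed`, `retrieve`, and `compose` are all assumptions made for illustration, standing in for whatever learned encoders and models a real system would use.

```python
import math
from collections import Counter

# Hypothetical adapter catalog: names and descriptions are invented for this sketch.
ADAPTERS = {
    "watercolor_style": "soft watercolor painting style with pastel colors",
    "cyberpunk_city": "neon cyberpunk city street at night",
    "fluffy_cat": "detailed fluffy cat fur and whiskers",
    "oil_portrait": "classical oil painting portrait of a person",
}

STOPWORDS = {"a", "an", "the", "of", "in", "at", "and", "with", "on"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def embed(text):
    """Toy bag-of-words 'embedding' (an assumed stand-in for a learned encoder)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(prompt, index, k=3):
    """Stage 2: return up to k adapter names with nonzero similarity to the prompt."""
    q = embed(prompt)
    scored = sorted(((cosine(q, vec), name) for name, vec in index.items()), reverse=True)
    return [name for score, name in scored[:k] if score > 0]


def compose(prompt, candidates, descriptions):
    """Stage 3: map each prompt keyword to the retrieved adapters that mention it."""
    plan = {}
    for kw in tokenize(prompt):
        matches = [n for n in candidates if kw in tokenize(descriptions[n])]
        if matches:
            plan[kw] = matches
    return plan


# Stage 1: precompute one embedding per adapter description.
INDEX = {name: embed(desc) for name, desc in ADAPTERS.items()}

prompt = "a fluffy cat in a cyberpunk city"
candidates = retrieve(prompt, INDEX)
plan = compose(prompt, candidates, ADAPTERS)
print(candidates)
print(plan)
```

In this toy setup, exact word matching plays the role of the "how well they fit the prompt" check; a real system would need a far richer relevance test, since most adapter descriptions will not share literal words with the prompt.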

Comments:	Project Website: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Graphics (cs.GR); Machine Learning (cs.LG)
Cite as:	arXiv:2404.18928 [cs.CV]
	(or arXiv:2404.18928v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2404.18928

Submission history

From: Michael Luo Zhiyu
[v1] Mon, 29 Apr 2024 17:59:16 UTC (32,060 KB)
