ICAS: IP Adapter and ControlNet-based Attention Structure for Multi-Subject Style Transfer Optimization

Liu, Fuwei

Abstract:Generating multi-subject stylized images remains a significant challenge due to the ambiguity in defining style attributes (e.g., color, texture, atmosphere, and structure) and the difficulty in consistently applying them across multiple subjects. Although recent diffusion-based text-to-image models have achieved remarkable progress, existing methods typically rely on computationally expensive inversion procedures or large-scale stylized datasets. Moreover, these methods often struggle with maintaining multi-subject semantic fidelity and are limited by high inference costs. To address these limitations, we propose ICAS (IP-Adapter and ControlNet-based Attention Structure), a novel framework for efficient and controllable multi-subject style transfer. Instead of full-model tuning, ICAS adaptively fine-tunes only the content injection branch of a pre-trained diffusion model, thereby preserving identity-specific semantics while enhancing style controllability. By combining IP-Adapter for adaptive style injection with ControlNet for structural conditioning, our framework ensures faithful global layout preservation alongside accurate local style synthesis. Furthermore, ICAS introduces a cyclic multi-subject content embedding mechanism, which enables effective style transfer under limited-data settings without the need for extensive stylized corpora. Extensive experiments show that ICAS achieves superior performance in structure preservation, style consistency, and inference efficiency, establishing a new paradigm for multi-subject style transfer in real-world applications.

Comments:	10 pages, 6 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2504.13224 [cs.CV]
	(or arXiv:2504.13224v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2504.13224

Computer Science > Computer Vision and Pattern Recognition

Title:ICAS: IP Adapter and ControlNet-based Attention Structure for Multi-Subject Style Transfer Optimization

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators