InstructLayout: Instruction-Driven 2D and 3D Layout Synthesis with Semantic Graph Prior

Lin, Chenguo; Lin, Yuchen; Pan, Panwang; Zhang, Xuanyang; Mu, Yadong

Computer Science > Computer Vision and Pattern Recognition

arXiv:2407.07580 (cs)

[Submitted on 10 Jul 2024 (v1), last revised 11 Jul 2024 (this version, v2)]

Abstract: Comprehending natural language instructions is a desirable property for both 2D and 3D layout synthesis systems. Existing methods model the joint distribution of objects and express object relations only implicitly, which hinders controllable generation. We introduce InstructLayout, a novel generative framework that integrates a semantic graph prior and a layout decoder to improve controllability and fidelity for 2D and 3D layout synthesis. The proposed semantic graph prior learns layout appearances and object distributions simultaneously, demonstrating versatility across various downstream tasks in a zero-shot manner. To facilitate benchmarking of text-driven 2D and 3D scene synthesis, we curate two high-quality datasets of layout-instruction pairs, one for each setting, from public Internet resources with large language and multimodal models. Extensive experimental results show that the proposed method outperforms existing state-of-the-art approaches by a large margin in both 2D and 3D layout synthesis tasks. Thorough ablation studies confirm the efficacy of crucial design components.

Comments:	This paper is an extension of ICLR 2024 "InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior". arXiv admin note: substantial text overlap with arXiv:2402.04717
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2407.07580 [cs.CV]
	(or arXiv:2407.07580v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2407.07580

Submission history

From: Yuchen Lin
[v1] Wed, 10 Jul 2024 12:13:39 UTC (35,869 KB)
[v2] Thu, 11 Jul 2024 03:19:08 UTC (35,869 KB)
