Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation

Elizarov, Slava; Rowles, Ciara; Donné, Simon

Computer Science > Computer Vision and Pattern Recognition

arXiv:2409.03718 (cs)

[Submitted on 5 Sep 2024]


Abstract: Generating high-quality 3D objects from textual descriptions remains a challenging problem due to computational cost, the scarcity of 3D data, and complex 3D representations. We introduce Geometry Image Diffusion (GIMDiffusion), a novel Text-to-3D model that utilizes geometry images to efficiently represent 3D shapes using 2D images, thereby avoiding the need for complex 3D-aware architectures. By integrating a Collaborative Control mechanism, we exploit the rich 2D priors of existing Text-to-Image models such as Stable Diffusion. This enables strong generalization even with limited 3D training data (allowing us to use only high-quality training data) as well as retaining compatibility with guidance techniques such as IPAdapter. In short, GIMDiffusion enables the generation of 3D assets at speeds comparable to current Text-to-Image models. The generated objects consist of semantically meaningful, separate parts and include internal structures, enhancing both usability and versatility.
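
To illustrate the general idea of a geometry image, the following is a minimal sketch and not the authors' implementation. It assumes a single-chart geometry image stored as an (H, W, 3) array whose pixels hold XYZ surface positions, with no validity mask, and it assumes that neighbouring pixels are connected on a regular grid. The function name `geometry_image_to_mesh` is hypothetical.

```python
import numpy as np

def geometry_image_to_mesh(gim):
    """Convert an (H, W, 3) array of XYZ samples into (vertices, faces).

    Assumption: every pixel is a valid surface sample and neighbouring
    pixels are connected, so each 2x2 pixel block yields two triangles.
    """
    h, w, _ = gim.shape
    vertices = gim.reshape(-1, 3)

    # Index of each pixel in the flattened vertex array.
    idx = np.arange(h * w).reshape(h, w)
    tl = idx[:-1, :-1].ravel()  # top-left corner of each grid cell
    tr = idx[:-1, 1:].ravel()   # top-right
    bl = idx[1:, :-1].ravel()   # bottom-left
    br = idx[1:, 1:].ravel()    # bottom-right

    # Split every grid cell into two triangles along the same diagonal.
    faces = np.concatenate(
        [np.stack([tl, bl, tr], axis=1), np.stack([tr, bl, br], axis=1)],
        axis=0,
    )
    return vertices, faces

# Toy input: a flat 4x5 patch in the z = 0 plane.
ys, xs = np.meshgrid(np.arange(4.0), np.arange(5.0), indexing="ij")
toy_gim = np.stack([xs, ys, np.zeros_like(xs)], axis=-1)
toy_vertices, toy_faces = geometry_image_to_mesh(toy_gim)
print(toy_vertices.shape, toy_faces.shape)
```

Because the connectivity is implied by the pixel grid, only the positions need to be stored, which is why such a surface can be handled as an ordinary image.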

Comments:	11 pages, 9 figures, Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
Cite as:	arXiv:2409.03718 [cs.CV]
	(or arXiv:2409.03718v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2409.03718

Submission history

From: Slava Elizarov
[v1] Thu, 5 Sep 2024 17:21:54 UTC (47,800 KB)
