S-INF: Towards Realistic Indoor Scene Synthesis via Scene Implicit Neural Field

Liang, Zixi; Xu, Guowei; Wu, Haifeng; Huang, Ye; Li, Wen; Duan, Lixin

arXiv:2412.17561 (cs)

[Submitted on 23 Dec 2024 (v1), last revised 4 Jan 2025 (this version, v2)]

Abstract: Learning-based methods have become increasingly popular in 3D indoor scene synthesis (ISS), showing superior performance over traditional optimization-based approaches. These learning-based methods typically model distributions on simple yet explicit scene representations using generative models. However, because the oversimplified explicit representations overlook detailed information and the methods lack guidance from multimodal relationships within the scene, most learning-based methods struggle to generate indoor scenes with realistic object arrangements and styles. In this paper, we introduce a new method, Scene Implicit Neural Field (S-INF), for indoor scene synthesis, which aims to learn meaningful representations of multimodal relationships and thereby enhance the realism of indoor scene synthesis. S-INF assumes that the scene layout is often related to detailed object information. It disentangles the multimodal relationships into scene layout relationships and detailed object relationships, fusing them later through implicit neural fields (INFs). By learning specialized scene layout relationships and projecting them into S-INF, we achieve realistic generation of scene layouts. Additionally, S-INF captures dense and detailed object relationships through differentiable rendering, ensuring stylistic consistency across objects. Through extensive experiments on the benchmark 3D-FRONT dataset, we demonstrate that our method consistently achieves state-of-the-art performance under different types of ISS.
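
For readers unfamiliar with the term, the sketch below illustrates what an implicit neural field conditioned on two separate codes can look like in general. It is not the S-INF architecture: the abstract does not specify network details, so the class name `ToySceneField`, the helper functions, the fusion by simple concatenation, the layer sizes, and the scalar output are all assumptions made for illustration. The weights are random and untrained.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Create one dense layer: small random weights and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x, activation=None):
    """Apply a dense layer to vector x, with an optional elementwise activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    if activation is not None:
        out = [activation(v) for v in out]
    return out


class ToySceneField:
    """Hypothetical coordinate MLP, NOT the paper's model.

    Maps a 3D point plus a layout code and an object-detail code to a
    scalar in (0, 1). Fusing the two codes by concatenation is an
    assumption of this sketch.
    """

    def __init__(self, layout_dim, detail_dim, hidden=16, seed=0):
        rng = random.Random(seed)
        self.layout_dim = layout_dim
        self.detail_dim = detail_dim
        self.l1 = make_layer(3 + layout_dim + detail_dim, hidden, rng)
        self.l2 = make_layer(hidden, 1, rng)

    def query(self, point, layout_code, detail_code):
        """Evaluate the field at one 3D point for the given pair of codes."""
        if (len(point) != 3
                or len(layout_code) != self.layout_dim
                or len(detail_code) != self.detail_dim):
            raise ValueError("input dimensions do not match the field")
        x = list(point) + list(layout_code) + list(detail_code)
        hidden = dense(self.l1, x, math.tanh)
        (logit,) = dense(self.l2, hidden)
        return 1.0 / (1.0 + math.exp(-logit))  # sigmoid keeps output in (0, 1)


field = ToySceneField(layout_dim=4, detail_dim=4)
value = field.query((0.1, 0.2, 0.3), [0.0] * 4, [1.0] * 4)
```

The point of the sketch is only that the field is a continuous function of the query coordinate, so it can be evaluated at arbitrary resolution, and that layout and detail information enter as separate inputs. How S-INF actually learns, projects, and fuses these relationships is described in the paper itself.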

Comments:	Accepted to AAAI 2025
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2412.17561 [cs.CV]
	(or arXiv:2412.17561v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2412.17561

Submission history

From: Zixi Liang
[v1] Mon, 23 Dec 2024 13:29:35 UTC (2,073 KB)
[v2] Sat, 4 Jan 2025 08:42:33 UTC (2,073 KB)
