CAD-Estate: Large-scale CAD Model Annotation in RGB Videos

Maninis, Kevis-Kokitsi; Popov, Stefan; Nießner, Matthias; Ferrari, Vittorio

arXiv:2306.09011 (cs)

[Submitted on 15 Jun 2023 (v1), last revised 14 Aug 2023 (this version, v2)]

Abstract: We propose a method for annotating videos of complex multi-object scenes with a globally-consistent 3D representation of the objects. We annotate each object with a CAD model from a database, and place it in the 3D coordinate frame of the scene with a 9-DoF pose transformation. Our method is semi-automatic and works on commonly-available RGB videos, without requiring a depth sensor. Many steps are performed automatically, and the tasks performed by humans are simple, well-specified, and require only limited reasoning in 3D. This makes them feasible for crowd-sourcing and has allowed us to construct a large-scale dataset by annotating real-estate videos from YouTube. Our dataset CAD-Estate offers 101k instances of 12k unique CAD models placed in the 3D representations of 20k videos. In comparison to Scan2CAD, the largest existing dataset with CAD model annotations on real scenes, CAD-Estate has 7x more instances and 4x more unique CAD models. We showcase the benefits of pre-training a Mask2CAD model on CAD-Estate for the task of automatic 3D object reconstruction and pose estimation, demonstrating that it leads to performance improvements on the popular Scan2CAD benchmark. The dataset is available at this https URL.
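As an illustration of what a 9-DoF pose transformation can look like, the sketch below composes a 4x4 matrix from three translation, three rotation, and three per-axis scale parameters and applies it to a point in CAD model coordinates. The abstract does not specify the parameterization, so the Z-Y-X Euler angles, the composition order (scale, then rotate, then translate), and the function names `rotation_zyx`, `pose_matrix`, and `transform_point` are all assumptions made for this example, not the dataset's actual format.

```python
import math


def rotation_zyx(yaw, pitch, roll):
    """Return the 3x3 rotation R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians.

    The Euler-angle convention is an assumption for this sketch.
    """
    cz, sz = math.cos(yaw), math.sin(yaw)
    cy, sy = math.cos(pitch), math.sin(pitch)
    cx, sx = math.cos(roll), math.sin(roll)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def pose_matrix(translation, euler_zyx, scale):
    """Compose the 4x4 matrix M = T * R * S from the nine pose parameters.

    translation: (tx, ty, tz); euler_zyx: (yaw, pitch, roll); scale: (sx, sy, sz).
    """
    r = rotation_zyx(*euler_zyx)
    # (R * S)[i][j] = R[i][j] * scale[j], since S is diagonal.
    m = [[r[i][j] * scale[j] for j in range(3)] + [translation[i]] for i in range(3)]
    m.append([0.0, 0.0, 0.0, 1.0])
    return m


def transform_point(m, point):
    """Map a point from CAD model coordinates into scene coordinates."""
    x, y, z = point
    return tuple(m[i][0] * x + m[i][1] * y + m[i][2] * z + m[i][3] for i in range(3))


# Hypothetical pose: stretch 2x along x, rotate 90 degrees about z, shift by 1 along x.
pose = pose_matrix(
    translation=(1.0, 0.0, 0.0),
    euler_zyx=(math.pi / 2, 0.0, 0.0),
    scale=(2.0, 1.0, 1.0),
)
print(transform_point(pose, (1.0, 0.0, 0.0)))  # approximately (1.0, 2.0, 0.0)
```

Compared with a rigid 6-DoF pose, the three extra scale parameters let a single database CAD model be stretched along each axis to fit an object whose proportions differ from the model's.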

Comments:	Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2306.09011 [cs.CV]
	(or arXiv:2306.09011v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2306.09011

Submission history

From: Kevis-Kokitsi Maninis
[v1] Thu, 15 Jun 2023 10:12:02 UTC (10,336 KB)
[v2] Mon, 14 Aug 2023 12:16:53 UTC (8,930 KB)
