Computer Science > Computer Vision and Pattern Recognition
[Submitted on 8 Mar 2025]
Title: From Dataset to Real-world: General 3D Object Detection via Generalized Cross-domain Few-shot Learning
Abstract: LiDAR-based 3D object detection datasets have been pivotal for autonomous driving, yet they cover a limited range of objects, which restricts a model's generalization across diverse deployment environments. To address this, we introduce the first generalized cross-domain few-shot (GCFS) task in 3D object detection, which focuses on adapting a source-pretrained model to perform well on both common and novel classes in a target domain using only few-shot samples. Our solution integrates multi-modal fusion and contrastive-enhanced prototype learning within one framework, jointly addressing the data scarcity and domain adaptation challenges of the GCFS setting. The multi-modal fusion module uses 2D vision-language models to extract rich, open-set semantic knowledge. To address biases in point distributions across objects of varying structural complexity, we introduce a physically-aware box searching strategy that leverages laser imaging principles to generate high-quality 3D box proposals from 2D insights, improving object recall. To capture domain-specific representations for each class from limited target data, we further propose contrastive-enhanced prototype learning, which strengthens the model's adaptability. We evaluate our approach on three GCFS benchmark settings, and extensive experiments demonstrate its effectiveness for GCFS tasks. The code will be publicly available.
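The abstract does not specify how contrastive-enhanced prototype learning is formulated, so the following is only a generic illustration of the underlying idea, not the authors' method. It assumes per-object feature vectors and integer class labels from a few-shot target set, builds one prototype per class as the normalized mean feature, and scores features against prototypes with an InfoNCE-style loss. The function names, the `temperature` parameter, and the use of mean-feature prototypes are all assumptions made for this sketch.

```python
import numpy as np


def class_prototypes(features, labels, num_classes):
    """Return one L2-normalized prototype per class (mean of that class's features).

    Hypothetical helper: classes with no samples keep a zero prototype.
    """
    dim = features.shape[1]
    protos = np.zeros((num_classes, dim))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = features[mask].mean(axis=0)
    norms = np.linalg.norm(protos, axis=1, keepdims=True)
    return protos / np.maximum(norms, 1e-12)


def prototype_contrastive_loss(features, labels, protos, temperature=0.1):
    """InfoNCE-style loss: pull each feature toward its own class prototype
    and push it away from the prototypes of other classes."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    feats = features / np.maximum(norms, 1e-12)
    # Cosine similarity between every feature and every prototype.
    logits = feats @ protos.T / temperature
    # Subtract the row maximum for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


if __name__ == "__main__":
    # Toy few-shot set: two classes, two samples each (made-up values).
    feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
    labels = np.array([0, 0, 1, 1])
    protos = class_prototypes(feats, labels, num_classes=2)
    print(prototype_contrastive_loss(feats, labels, protos))
```

In this toy setting the loss is low when features cluster around their own class prototype and rises when labels and prototypes disagree, which is the behavior a prototype-based contrastive objective is meant to encourage.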