CAD-Assistant: Tool-Augmented VLLMs as Generic CAD Task Solvers?

Mallis, Dimitrios; Karadeniz, Ahmet Serdar; Cavada, Sebastian; Rukhovich, Danila; Foteinopoulou, Niki; Cherenkova, Kseniya; Kacem, Anis; Aouada, Djamila

arXiv:2412.13810 (cs)

[Submitted on 18 Dec 2024]

Abstract: We propose CAD-Assistant, a general-purpose CAD agent for AI-assisted design. Our approach uses a powerful Vision and Large Language Model (VLLM) as a planner, together with a tool-augmentation paradigm built on CAD-specific modules. CAD-Assistant addresses multimodal user queries by generating actions that are iteratively executed on a Python interpreter equipped with the FreeCAD software, accessed via its Python API. Our framework assesses the impact of generated CAD commands on geometry and adapts subsequent actions based on the evolving state of the CAD design. We consider a wide range of CAD-specific tools, including Python libraries, modules of the FreeCAD Python API, helpful routines, rendering functions and other specialized modules. We evaluate our method on multiple CAD benchmarks and qualitatively demonstrate the potential of tool-augmented VLLMs as generic CAD task solvers across diverse CAD workflows.
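The iterative scheme the abstract describes (a planner emits an action, an interpreter executes it, and the result informs the next action) can be illustrated with a minimal sketch. This is not the authors' implementation: `scripted_planner` is a hypothetical stand-in for the VLLM, the "sketch" is a plain dictionary rather than a FreeCAD object, and the names `run_action` and `solve` are assumptions made for illustration only.

```python
import contextlib
import io


def run_action(code, namespace):
    """Execute one generated action in a persistent namespace.

    Returns whatever the action printed, plus an error note if it raised.
    """
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
    except Exception as exc:
        return f"{buffer.getvalue()}Error: {exc!r}"
    return buffer.getvalue()


def scripted_planner(query, history):
    """Hypothetical stand-in for a VLLM planner.

    Returns the next code action, or None when the task is considered done.
    A real planner would condition on the query and on past observations.
    """
    script = [
        "sketch = {'lines': [((0, 0), (2, 0))]}\nprint(len(sketch['lines']))",
        "sketch['lines'].append(((2, 0), (2, 1)))\nprint(len(sketch['lines']))",
    ]
    return script[len(history)] if len(history) < len(script) else None


def solve(query, planner, max_steps=10):
    """Plan, execute, observe; repeat until the planner stops or steps run out."""
    namespace = {}  # interpreter state shared across all actions
    history = []    # (action, observation) pairs fed back to the planner
    for _ in range(max_steps):
        action = planner(query, history)
        if action is None:
            break
        observation = run_action(action, namespace)
        history.append((action, observation))
    return namespace, history


namespace, history = solve("add a vertical line to the sketch", scripted_planner)
```

Keeping one namespace across steps is what lets a later action build on the state left by earlier ones, while the recorded observations give the planner a view of that evolving state.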

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
Cite as:	arXiv:2412.13810 [cs.CV]
	(or arXiv:2412.13810v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2412.13810

Submission history

From: Dimitrios Mallis Dr
[v1] Wed, 18 Dec 2024 12:57:56 UTC (30,280 KB)