FlexCAD: Unified and Versatile Controllable CAD Generation with Fine-tuned Large Language Models

Zhang, Zhanwei; Sun, Shizhao; Wang, Wenxiao; Cai, Deng; Bian, Jiang

arXiv:2411.05823 (cs)

[Submitted on 5 Nov 2024]

Abstract: Recently, there has been growing interest in creating computer-aided design (CAD) models based on user intent, a task known as controllable CAD generation. Existing work offers limited controllability and needs separate models for different types of control, which reduces efficiency and practicality. To achieve controllable generation across all CAD construction hierarchies, such as sketch-extrusion, extrusion, sketch, face, loop and curve, we propose FlexCAD, a unified model obtained by fine-tuning large language models (LLMs). First, to enhance comprehension by LLMs, we represent a CAD model as structured text by abstracting each hierarchy as a sequence of text tokens. Second, to address various controllable generation tasks in a unified model, we introduce a hierarchy-aware masking strategy. Specifically, during training, we replace a hierarchy-aware field in the CAD text with a mask token. This field, composed of a sequence of tokens, can be set flexibly to represent various hierarchies. We then ask the LLM to predict the masked field. During inference, the user intent is converted into a CAD text in which a mask token replaces the part the user wants to modify, and this text is fed into FlexCAD to generate new CAD models. Comprehensive experiments on a public dataset demonstrate the effectiveness of FlexCAD in both generation quality and controllability. Code will be available at this https URL.
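
The hierarchy-aware masking described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the nested-list layout of the CAD model, the curve strings, the delimiter tokens such as `<loop_end>`, the `[mask]` string, and the `serialize` helper. The sketch only shows the general idea of replacing one field, at any chosen hierarchy, with a single mask token and keeping the hidden text as the prediction target.

```python
MASK = "[mask]"  # hypothetical mask token, not the paper's actual vocabulary


def serialize(model, masked=None):
    """Flatten a toy CAD model into text.

    `masked` is an optional path tuple such as ("loop", 0, 0, 1) naming the
    field to hide. Returns (text, target), where `target` is the hidden text
    or None when nothing was masked.
    """
    hidden = []

    def emit(path, text):
        # Swap the selected field for the mask token and remember its text.
        if path == masked:
            hidden.append(text)
            return MASK
        return text

    parts = []
    for i, se in enumerate(model):
        faces = []
        for j, face in enumerate(se["sketch"]):
            loops = []
            for k, loop in enumerate(face):
                curves = [emit(("curve", i, j, k, m), c) for m, c in enumerate(loop)]
                loops.append(emit(("loop", i, j, k), " ".join(curves) + " <loop_end>"))
            faces.append(emit(("face", i, j), " ".join(loops) + " <face_end>"))
        sketch = emit(("sketch", i), " ".join(faces) + " <sketch_end>")
        extrusion = emit(("extrusion", i), se["extrusion"] + " <extrusion_end>")
        parts.append(emit(("sketch-extrusion", i), sketch + " " + extrusion))
    return " ".join(parts), (hidden[0] if hidden else None)


# Toy model: one sketch-extrusion pair whose single face has two loops
# (an outer square made of four lines, and an inner circle).
model = [
    {
        "sketch": [[
            ["line 0 0", "line 8 0", "line 8 8", "line 0 8"],
            ["circle 4 4 2"],
        ]],
        "extrusion": "add 0 4",
    }
]

full_text, _ = serialize(model)

# Training-style pair: hide the second loop and keep its text as the target.
loop_prompt, loop_target = serialize(model, masked=("loop", 0, 0, 1))

# The same function masks a different hierarchy by changing only the path.
ext_prompt, ext_target = serialize(model, masked=("extrusion", 0))
```

Under these assumptions, a training example would pair `loop_prompt` with `loop_target`. At inference time, a user who wants a different inner loop would supply the same kind of masked text and let the fine-tuned model fill in the field.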

Comments:	23 pages
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
Cite as:	arXiv:2411.05823 [cs.CV]
	(or arXiv:2411.05823v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2411.05823

Submission history

From: Zhanwei Zhang
[v1] Tue, 5 Nov 2024 05:45:26 UTC (11,931 KB)
