DoodleFormer: Creative Sketch Drawing with Transformers

Bhunia, Ankan Kumar; Khan, Salman; Cholakkal, Hisham; Anwer, Rao Muhammad; Khan, Fahad Shahbaz; Laaksonen, Jorma; Felsberg, Michael

arXiv:2112.03258 (cs)

[Submitted on 6 Dec 2021 (v1), last revised 15 Sep 2022 (this version, v3)]

Abstract: Creative sketching or doodling is an expressive activity, where imaginative and previously unseen depictions of everyday visual objects are drawn. Creative sketch image generation is a challenging vision problem, where the task is to generate diverse, yet realistic creative sketches possessing unseen compositions of visual-world objects. Here, we propose a novel coarse-to-fine two-stage framework, DoodleFormer, that decomposes the creative sketch generation problem into the creation of a coarse sketch composition followed by the incorporation of fine details in the sketch. We introduce graph-aware transformer encoders that effectively capture global dynamic as well as local static structural relations among different body parts. To ensure diversity of the generated creative sketches, we introduce a probabilistic coarse sketch decoder that explicitly models the variations of each sketch body part to be drawn. Experiments are performed on two creative sketch datasets: Creative Birds and Creative Creatures. Our qualitative, quantitative and human-based evaluations show that DoodleFormer outperforms the state-of-the-art on both datasets, yielding realistic and diverse creative sketches. On Creative Creatures, DoodleFormer achieves an absolute gain of 25 in terms of Fréchet inception distance (FID) over the state-of-the-art. We also demonstrate the effectiveness of DoodleFormer for related applications of text to creative sketch generation and sketch completion.
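
For readers unfamiliar with the FID figure quoted in the abstract, the metric as it is commonly defined compares Gaussian fits to Inception-network features of real and generated images. The symbols below are notation assumed here for illustration and are not taken from the paper: \mu_r, \Sigma_r denote the feature mean and covariance of real sketches, and \mu_g, \Sigma_g those of generated sketches.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the two feature distributions are closer, so the reported absolute gain of 25 presumably corresponds to an FID score about 25 points lower than that of the prior state-of-the-art.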

Comments:	Accepted to ECCV-2022. Project webpage: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
Cite as:	arXiv:2112.03258 [cs.CV]
	(or arXiv:2112.03258v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2112.03258

Submission history

From: Ankan Kumar Bhunia
[v1] Mon, 6 Dec 2021 18:59:59 UTC (6,291 KB)
[v2] Sat, 9 Jul 2022 06:21:04 UTC (8,082 KB)
[v3] Thu, 15 Sep 2022 17:59:49 UTC (8,083 KB)
