Inducing Hierarchical Compositional Model by Sparsifying Generator Network

Xing, Xianglei; Wu, Tianfu; Zhu, Song-Chun; Wu, Ying Nian

arXiv:1909.04324 (cs)

[Submitted on 10 Sep 2019 (v1), last revised 20 Jun 2020 (this version, v2)]

Abstract: This paper proposes learning a hierarchical compositional AND-OR model for interpretable image synthesis by sparsifying the generator network. The proposed method adopts the scene-objects-parts-subparts-primitives hierarchy in image representation. A scene has different types (i.e., OR), each of which consists of a number of objects (i.e., AND). This decomposition can be formulated recursively across the scene-objects-parts-subparts hierarchy and terminates at the primitive level (e.g., wavelet-like basis functions). To realize this AND-OR hierarchy in image synthesis, we learn a generator network that consists of the following two components: (i) Each layer of the hierarchy is represented by an over-complete set of convolutional basis functions. Off-the-shelf convolutional neural architectures are exploited to implement the hierarchy. (ii) Sparsity-inducing constraints are introduced in end-to-end training, which induce a sparsely activated and sparsely connected AND-OR model from the initially densely connected generator network. A straightforward sparsity-inducing constraint is used: only the top-$k$ basis functions are allowed to be activated at each layer (where $k$ is a hyper-parameter). The learned basis functions are also capable of image reconstruction to explain the input images. In experiments, the proposed method is tested on four benchmark datasets. The results show that it learns meaningful and interpretable hierarchical representations and achieves better image synthesis and reconstruction quality than the baselines.
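
The top-$k$ constraint mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `top_k_sparsify`, the flat list of activations, and selection by largest value are all assumptions made for illustration. The abstract does not say whether selection is applied per spatial location, per channel, or by magnitude.

```python
from typing import List


def top_k_sparsify(activations: List[float], k: int) -> List[float]:
    """Keep the k largest activations and set all others to zero.

    Hypothetical helper: a flat list stands in for one layer's
    basis-function coefficients.
    """
    if k <= 0:
        return [0.0] * len(activations)
    if k >= len(activations):
        return list(activations)
    # Indices of the k largest values; sorted() is stable, so ties
    # are resolved in favour of the lower index.
    order = sorted(range(len(activations)),
                   key=lambda i: activations[i], reverse=True)
    keep = set(order[:k])
    return [a if i in keep else 0.0 for i, a in enumerate(activations)]


# Example: only the two strongest basis functions stay active.
layer_activations = [0.1, 2.0, -0.5, 1.5, 0.3]
sparse = top_k_sparsify(layer_activations, k=2)
print(sparse)  # [0.0, 2.0, 0.0, 1.5, 0.0]
```

In a real generator network the same idea would presumably be applied to feature maps at each layer during end-to-end training, with $k$ chosen per layer as a hyper-parameter.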

Comments:	This is the CVPR version
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Machine Learning (stat.ML)
Cite as:	arXiv:1909.04324 [cs.CV]
	(or arXiv:1909.04324v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1909.04324

Submission history

From: Xianglei Xing
[v1] Tue, 10 Sep 2019 07:06:33 UTC (7,713 KB)
[v2] Sat, 20 Jun 2020 05:02:00 UTC (3,803 KB)
