DVHGNN: Multi-Scale Dilated Vision HGNN for Efficient Vision Recognition

Li, Caoshuo; Li, Tanzhe; Hu, Xiaobin; Luo, Donghao; Jin, Taisong

Computer Science > Computer Vision and Pattern Recognition

arXiv:2503.14867 (cs)

[Submitted on 19 Mar 2025]



Abstract: Vision Graph Neural Network (ViG) has recently gained considerable attention in computer vision. Despite its innovation, ViG faces key issues, including the quadratic computational complexity of its K-Nearest Neighbor (KNN) graph construction and the restriction of ordinary graphs to pairwise relations. To address these challenges, we propose a novel vision architecture, termed Dilated Vision HyperGraph Neural Network (DVHGNN), which leverages multi-scale hypergraphs to efficiently capture high-order correlations among objects. Specifically, the proposed method tailors Clustering and Dilated HyperGraph Construction (DHGC) to adaptively capture multi-scale dependencies among the data samples. Furthermore, a dynamic hypergraph convolution mechanism is proposed to facilitate adaptive feature exchange and fusion at the hypergraph level. Extensive qualitative and quantitative evaluations on benchmark image datasets demonstrate that the proposed DVHGNN significantly outperforms state-of-the-art vision backbones. For instance, DVHGNN-S achieves a top-1 accuracy of 83.1% on ImageNet-1K, surpassing ViG-S by +1.0% and ViHGNN-S by +0.6%.
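To make the two issues named in the abstract concrete, the following is a minimal, generic sketch and not the paper's DHGC or dynamic hypergraph convolution. It assumes brute-force KNN over toy 2-D points and a plain mean-based vertex-to-hyperedge-to-vertex aggregation; the function names `knn_graph` and `hyperedge_mean_aggregate` are hypothetical and chosen only for illustration.

```python
from math import dist


def knn_graph(points, k):
    """Brute-force KNN graph (illustrative): every point is compared with
    every other point, so the number of distances grows as O(N^2)."""
    neighbors = []
    for i, p in enumerate(points):
        # N - 1 distance evaluations for this single point.
        scored = sorted((dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append([j for _, j in scored[:k]])
    return neighbors


def hyperedge_mean_aggregate(features, hyperedges):
    """Generic two-step hypergraph message passing (illustrative).

    A hyperedge is a list of vertex indices and may join more than two
    vertices, which a pairwise graph edge cannot do.
    """
    dim = len(features[0])
    # Step 1: each hyperedge takes the mean of its member vertices.
    edge_feats = [
        [sum(features[v][d] for v in members) / len(members) for d in range(dim)]
        for members in hyperedges
    ]
    # Step 2: each vertex takes the mean of the hyperedges that contain it.
    out = []
    for v in range(len(features)):
        incident = [e for e, members in enumerate(hyperedges) if v in members]
        if not incident:
            out.append(list(features[v]))  # isolated vertex keeps its feature
            continue
        out.append(
            [sum(edge_feats[e][d] for e in incident) / len(incident) for d in range(dim)]
        )
    return out


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
neighbors = knn_graph(points, k=2)

features = [[1.0], [3.0], [5.0]]
hyperedges = [[0, 1], [1, 2]]
updated = hyperedge_mean_aggregate(features, hyperedges)
```

In this toy setting, the KNN step evaluates N(N-1) distances for N points, which is the quadratic cost the abstract refers to, while a single hyperedge can relate any number of vertices at once.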

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2503.14867 [cs.CV]
	(or arXiv:2503.14867v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.14867

Submission history

From: Caoshuo Li
[v1] Wed, 19 Mar 2025 03:45:23 UTC (9,730 KB)
