Improving Food Image Recognition with Noisy Vision Transformer

Ghosh, Tonmoy; Sazonov, Edward


arXiv:2503.18997 (cs)

[Submitted on 24 Mar 2025]


Abstract: Food image recognition is a challenging task in computer vision due to the high variability and complexity of food images. In this study, we investigate the potential of Noisy Vision Transformers (NoisyViT) for improving food classification performance. By introducing noise into the learning process, NoisyViT reduces task complexity and adjusts the entropy of the system, leading to enhanced model accuracy. We fine-tune NoisyViT on three benchmark datasets: Food2K (2,000 categories, ~1M images), Food-101 (101 categories, ~100K images), and CNFOOD-241 (241 categories, ~190K images). The performance of NoisyViT is evaluated against state-of-the-art food recognition models. Our results demonstrate that NoisyViT achieves Top-1 accuracies of 95%, 99.5%, and 96.6% on Food2K, Food-101, and CNFOOD-241, respectively, significantly outperforming existing approaches. This study underscores the potential of NoisyViT for dietary assessment, nutritional monitoring, and healthcare applications, paving the way for future advancements in vision-based food computing. Code for reproducing NoisyViT for food recognition is available at NoisyViT_Food.
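
The abstract reports results as Top-1 accuracy, that is, the fraction of test images whose highest-scoring predicted class matches the ground-truth label. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name `top1_accuracy` and the example scores and labels are hypothetical and are not taken from the paper.

```python
def top1_accuracy(scores, labels):
    """Return the fraction of samples whose highest-scoring class equals the label.

    scores: list of per-class score lists, one list per image.
    labels: list of integer class indices, one per image.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score; ties resolve to the lowest index.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Hypothetical scores for 4 images over 3 classes (illustrative, not paper data).
example_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
example_labels = [1, 0, 2, 0]

accuracy = top1_accuracy(example_scores, example_labels)
print(f"Top-1 accuracy: {accuracy:.2%}")
```

In this toy example three of the four predictions match their labels. On a benchmark such as Food-101, the same computation would run over the model's outputs for every test image.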

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
Cite as:	arXiv:2503.18997 [cs.CV]
	(or arXiv:2503.18997v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.18997

Submission history

From: Tonmoy Ghosh
[v1] Mon, 24 Mar 2025 03:03:00 UTC (1,894 KB)
