Real-Time Semantic Segmentation of Aerial Images Using an Embedded U-Net: A Comparison of CPU, GPU, and FPGA Workflows

Posso, Julien; Kieffer, Hugo; Menga, Nicolas; Hlimi, Omar; Tarris, Sébastien; Guerard, Hubert; Bois, Guy; Couderc, Matthieu; Jenn, Eric

Computer Science > Computer Vision and Pattern Recognition

arXiv:2503.08700 (cs)

[Submitted on 7 Mar 2025]

Title:Real-Time Semantic Segmentation of Aerial Images Using an Embedded U-Net: A Comparison of CPU, GPU, and FPGA Workflows

Authors:Julien Posso, Hugo Kieffer, Nicolas Menga, Omar Hlimi, Sébastien Tarris, Hubert Guerard, Guy Bois, Matthieu Couderc, Eric Jenn

View PDF

Abstract:This study introduces a lightweight U-Net model optimized for real-time semantic segmentation of aerial images, targeting the efficient utilization of Commercial Off-The-Shelf (COTS) embedded computing platforms. We maintain the accuracy of the U-Net on a real-world dataset while significantly reducing the model's parameters and Multiply-Accumulate (MAC) operations by a factor of 16. Our comprehensive analysis covers three hardware platforms (CPU, GPU, and FPGA) and five different toolchains (TVM, FINN, Vitis AI, TensorFlow GPU, and cuDNN), assessing each on metrics such as latency, power consumption, memory footprint, energy efficiency, and FPGA resource usage. The results highlight the trade-offs between these platforms and toolchains, with a particular focus on the practical deployment challenges in real-world applications. Our findings demonstrate that while the FPGA with Vitis AI emerges as the superior choice due to its performance, energy efficiency, and maturity, it requires specialized hardware knowledge, emphasizing the need for a balanced approach in selecting embedded computing solutions for semantic segmentation tasks

Comments:	ERTS2024, Jun 2024, Toulouse, France
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Hardware Architecture (cs.AR); Machine Learning (cs.LG)
Cite as:	arXiv:2503.08700 [cs.CV]
	(or arXiv:2503.08700v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.08700

Submission history

From: Julien Posso [view email] [via CCSD proxy]
[v1] Fri, 7 Mar 2025 08:33:28 UTC (1,130 KB)

Computer Science > Computer Vision and Pattern Recognition

Title:Real-Time Semantic Segmentation of Aerial Images Using an Embedded U-Net: A Comparison of CPU, GPU, and FPGA Workflows

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computer Vision and Pattern Recognition

Title:Real-Time Semantic Segmentation of Aerial Images Using an Embedded U-Net: A Comparison of CPU, GPU, and FPGA Workflows

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators