CorrDiff: Adaptive Delay-aware Detector with Temporal Cue Inputs for Real-time Object Detection

Zhang, Xiang; Fu, Chenchen; Cui, Yufei; Yi, Lan; Sun, Yuyang; Wu, Weiwei; Liu, Xue

Abstract: Real-time object detection plays an essential role in the decision-making of many real-world applications, including collision avoidance and path planning in autonomous driving systems. This paper presents CorrDiff, a novel real-time streaming perception method designed to tackle the challenge of delays in real-time detection systems. The main contribution of CorrDiff is its adaptive delay-aware detector, which uses runtime-estimated temporal cues to predict objects' locations for multiple future frames and selectively outputs the prediction that matches real-world time, compensating for communication and computational delays. By leveraging motion estimation and feature enhancement, the proposed model outperforms current state-of-the-art methods both in 1) single-frame detection for the current or the next frame, measured by mAP, and 2) prediction for one or more future frames, measured by sAP (a metric that evaluates object detection algorithms in streaming scenarios, factoring in both latency and accuracy). It demonstrates robust performance across a range of devices, from the powerful Tesla V100 to the more modest RTX 2080Ti, achieving the highest perceptual accuracy on all platforms. Unlike most state-of-the-art methods, which struggle to complete computation within a single frame on less powerful devices, CorrDiff meets the stringent real-time processing requirements on every device evaluated. The experimental results emphasize the system's adaptability and its potential to significantly improve the safety and reliability of many real-world systems, such as autonomous driving. Our code is open source and available at this https URL.
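
The abstract describes predicting object locations for several future frames and then emitting the one that matches real-world time. A minimal sketch of that selection step follows. It is an illustration under stated assumptions, not the authors' implementation: the function names, the fixed frame rate, and the ceiling-based rule for converting a measured delay into a frame offset are all hypothetical.

```python
import math


def select_frame_offset(delay_s: float, fps: float, max_offset: int) -> int:
    """Return how many frames ahead the emitted prediction should target.

    Hypothetical rule: a result that becomes available `delay_s` seconds
    after its input frame was captured is matched to the first frame that
    starts at or after that moment, clamped to the farthest horizon the
    detector predicted.
    """
    if delay_s < 0 or fps <= 0 or max_offset < 0:
        raise ValueError("delay_s and max_offset must be >= 0 and fps > 0")
    # Small epsilon so a delay of exactly k frame intervals maps to offset k.
    frames_elapsed = math.ceil(delay_s * fps - 1e-9)
    return min(max(frames_elapsed, 0), max_offset)


def pick_prediction(predictions, delay_s: float, fps: float):
    """Pick one entry from `predictions`, where predictions[k] is assumed
    to hold the detections forecast for frame t + k."""
    offset = select_frame_offset(delay_s, fps, len(predictions) - 1)
    return offset, predictions[offset]


if __name__ == "__main__":
    # Placeholder outputs for frames t, t+1, t+2, t+3 of a 30 fps stream.
    forecasts = ["boxes@t", "boxes@t+1", "boxes@t+2", "boxes@t+3"]
    for delay in (0.0, 0.02, 0.05, 0.2):
        print(delay, pick_prediction(forecasts, delay, fps=30.0))
```

Under these assumptions, a slower device (larger measured delay) simply reads a later entry from the same set of forecasts, which is one way to picture how a single detector could adapt across hardware.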

Comments:	Submitted to IEEE JSAC Special Issue: Intelligent Communications for Real-Time Computer Vision (Comm4CV)
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2501.05132 [cs.CV]
	(or arXiv:2501.05132v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2501.05132
