Weakly-Supervised Online Action Segmentation in Multi-View Instructional Videos

Ghoddoosian, Reza; Dwivedi, Isht; Agarwal, Nakul; Choi, Chiho; Dariush, Behzad

arXiv:2203.13309 (cs)

[Submitted on 24 Mar 2022]

Abstract: This paper addresses a new problem: weakly-supervised online action segmentation in instructional videos. We present a framework that segments streaming videos online at test time using Dynamic Programming and show its advantages over a greedy sliding-window approach. We improve the framework by introducing the Online-Offline Discrepancy Loss (OODL), which encourages segmentation results with higher temporal consistency. Furthermore, during training only, we exploit frame-wise correspondence between multiple views as supervision for training on weakly-labeled instructional videos. In particular, we investigate three different multi-view inference techniques to generate more accurate frame-wise pseudo ground-truth at no additional annotation cost. We present results and ablation studies on two benchmark multi-view datasets, Breakfast and IKEA ASM. Experimental results show the efficacy of the proposed methods both qualitatively and quantitatively in two domains, cooking and assembly.
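To make the contrast between dynamic-programming and greedy online decoding concrete, here is a minimal illustrative sketch. It is not the paper's algorithm: it assumes a known ordered action list (`transcript`), hand-made frame-wise class probabilities, and a per-frame argmax as a simplified stand-in for the greedy baseline (the paper's baseline is a sliding window). The function names `greedy_online` and `dp_online` are hypothetical.

```python
import math


def greedy_online(log_probs):
    """Label each incoming frame independently with its highest-scoring class."""
    return [max(range(len(frame)), key=frame.__getitem__) for frame in log_probs]


def dp_online(log_probs, transcript):
    """Online Viterbi-style decoding under an assumed, fixed action order.

    scores[k] is the best cumulative log-probability of any monotonic
    alignment of the frames seen so far that currently sits at transcript[k].
    Only frames up to the current time step are used for each decision.
    """
    scores = [-math.inf] * len(transcript)
    labels = []
    for t, frame in enumerate(log_probs):
        if t == 0:
            # Assumption: every video starts at the first transcript step.
            scores[0] = frame[transcript[0]]
        else:
            new_scores = []
            for k, action in enumerate(transcript):
                stay = scores[k]
                advance = scores[k - 1] if k > 0 else -math.inf
                new_scores.append(max(stay, advance) + frame[action])
            scores = new_scores
        # Emit the action of the best-scoring transcript position right now.
        best_k = max(range(len(transcript)), key=scores.__getitem__)
        labels.append(transcript[best_k])
    return labels


# Hand-made probabilities for 5 frames and 3 classes (illustrative only).
probs = [
    [0.8, 0.1, 0.1],
    [0.3, 0.2, 0.5],  # noisy frame: class 2 spikes too early
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
frames = [[math.log(p) for p in row] for row in probs]

greedy_labels = greedy_online(frames)         # [0, 2, 1, 1, 2]
dp_labels = dp_online(frames, [0, 1, 2])      # [0, 0, 1, 1, 2]
```

In this toy input the per-frame argmax follows the noisy second frame and jumps to class 2 before class 1 has occurred, while the DP variant cannot reach the third transcript step that early and keeps label 0. Online DP decisions can still change from frame to frame as evidence accumulates, which is the kind of temporal inconsistency the abstract says OODL is meant to reduce.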

Comments:	Accepted CVPR 2022
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2203.13309 [cs.CV]
	(or arXiv:2203.13309v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2203.13309

Submission history

From: Reza Ghoddoosian
[v1] Thu, 24 Mar 2022 19:27:56 UTC (8,612 KB)
