Learning Activity View-invariance Under Extreme Viewpoint Changes via Curriculum Knowledge Distillation

Somayazulu, Arjun; Mavroudi, Efi; Chen, Changan; Torresani, Lorenzo; Grauman, Kristen

arXiv:2504.05451 (cs)

[Submitted on 7 Apr 2025]

Abstract: Traditional methods for view-invariant learning from video rely on controlled multi-view settings with minimal scene clutter. However, they struggle with in-the-wild videos that exhibit extreme viewpoint differences and share little visual content. We introduce a method for learning rich video representations in the presence of such severe view-occlusions. We first define a geometry-based metric that ranks views at a fine-grained temporal scale by their likely occlusion level. Then, using those rankings, we formulate a knowledge distillation objective that preserves action-centric semantics with a novel curriculum learning procedure that pairs incrementally more challenging views over time, thereby allowing smooth adaptation to extreme viewpoint differences. We evaluate our approach on two tasks, outperforming state-of-the-art models on both temporal keystep grounding and fine-grained keystep recognition benchmarks, particularly on views that exhibit severe occlusion.
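
The curriculum idea in the abstract (rank views by likely occlusion, then pair incrementally harder views as training proceeds) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the linear schedule, the occlusion scores, and the choice of the least-occluded view as the distillation reference are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def rank_by_occlusion(scores):
    """Sort view ids from least to most occluded.

    `scores` maps a view id to a hypothetical occlusion score
    (higher means more occluded). The paper's geometry-based
    metric is not reproduced here.
    """
    return sorted(scores, key=scores.get)


def curriculum_pool(ranked_views, progress):
    """Return the views eligible for pairing at a given training progress.

    `progress` is clamped to [0, 1]. The pool grows linearly from the
    single easiest view to the full list (an assumed schedule).
    """
    if not ranked_views:
        raise ValueError("ranked_views must not be empty")
    progress = min(max(progress, 0.0), 1.0)
    k = max(1, math.ceil(progress * len(ranked_views)))
    return ranked_views[:k]


def sample_pair(scores, progress, rng):
    """Pick a (reference_view, paired_view) tuple for one training step.

    Assumption: the least-occluded view serves as the reference, and the
    paired view is drawn from the current curriculum pool of the rest.
    """
    ranked = rank_by_occlusion(scores)
    reference, rest = ranked[0], ranked[1:]
    paired = rng.choice(curriculum_pool(rest, progress))
    return reference, paired


# Hypothetical occlusion scores for four camera views of one clip.
scores = {"cam_a": 0.1, "cam_b": 0.3, "cam_c": 0.6, "cam_d": 0.9}
rng = random.Random(0)
early_pair = sample_pair(scores, progress=0.0, rng=rng)
late_pair = sample_pair(scores, progress=1.0, rng=rng)
```

Early in training only the easiest remaining view can be paired with the reference; by the end any view, including the most occluded one, may be drawn. A real system would compute the scores per time segment and feed each pair into a distillation loss, neither of which is shown here.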

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2504.05451 [cs.CV]
	(or arXiv:2504.05451v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2504.05451

Submission history

From: Arjun Somayazulu
[v1] Mon, 7 Apr 2025 19:30:30 UTC (17,068 KB)
