Video Exploration via Video-Specific Autoencoders

Wang, Kevin; Ramanan, Deva; Bansal, Aayush

arXiv:2103.17261v1 (cs)

[Submitted on 31 Mar 2021 (this version), latest version 7 Jan 2022 (v2)]

Abstract: We present simple video-specific autoencoders that enable human-controllable video exploration. This includes a wide variety of analytic tasks such as (but not limited to) spatial and temporal super-resolution, spatial and temporal editing, object removal, video textures, average video exploration, and correspondence estimation within and across videos. Prior work has looked at each of these problems independently and proposed different formulations. In this work, we observe that a simple autoencoder trained (from scratch) on multiple frames of a specific video enables one to perform a large variety of video processing and editing tasks. Our tasks are enabled by two key observations: (1) latent codes learned by the autoencoder capture spatial and temporal properties of that video and (2) autoencoders can project out-of-sample inputs onto the video-specific manifold. For example, (1) interpolating latent codes enables temporal super-resolution and user-controllable video textures; (2) manifold reprojection enables spatial super-resolution, object removal, and denoising without training for any of the tasks. Importantly, a two-dimensional visualization of latent codes via principal component analysis acts as a tool for users to both visualize and intuitively control video edits. Finally, we quantitatively contrast our approach with the prior art and find that, without any supervision or task-specific knowledge, our approach can perform comparably to supervised approaches specifically trained for a task.
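
The latent-interpolation idea in observation (1) can be illustrated with a minimal sketch. It assumes each frame has already been encoded to a latent vector by some video-specific encoder, and it assumes plain linear blending between consecutive codes; the abstract does not specify the interpolation scheme, and the names `lerp` and `upsample_latents` are hypothetical, not taken from the paper. Decoding the resulting codes back to frames would need the trained decoder, which is outside this sketch.

```python
from typing import List, Sequence

Vector = List[float]


def lerp(a: Sequence[float], b: Sequence[float], t: float) -> Vector:
    """Linearly blend two latent codes: (1 - t) * a + t * b."""
    if len(a) != len(b):
        raise ValueError("latent codes must have the same dimensionality")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def upsample_latents(codes: Sequence[Sequence[float]], factor: int) -> List[Vector]:
    """Insert (factor - 1) interpolated codes between consecutive codes.

    Assumption: linear interpolation stands in for whatever scheme the
    authors use; this only shows the shape of the computation.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    if not codes:
        return []
    out: List[Vector] = []
    for a, b in zip(codes, codes[1:]):
        # t = 0 reproduces the original code; t in (0, 1) gives in-between codes.
        for k in range(factor):
            out.append(lerp(a, b, k / factor))
    # The final original code has no successor, so append it unchanged.
    out.append(list(codes[-1]))
    return out


if __name__ == "__main__":
    # Toy 2-D latent codes for three frames (made-up values).
    frame_codes = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
    for code in upsample_latents(frame_codes, factor=2):
        print(code)
```

With `factor=2`, three input codes become five: the originals plus one midpoint between each consecutive pair. Passing each of these codes through the decoder would, under the assumptions above, yield a sequence with roughly double the temporal resolution.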

Comments:	Project Page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
Cite as:	arXiv:2103.17261 [cs.CV]
	(or arXiv:2103.17261v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2103.17261

Submission history

From: Aayush Bansal
[v1] Wed, 31 Mar 2021 17:56:13 UTC (35,631 KB)
[v2] Fri, 7 Jan 2022 20:00:33 UTC (42,178 KB)
