iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks

Chadha, Aman; Britto, John; Roja, M. Mani

doi:10.1007/s41095-020-0175-7

Computer Science > Computer Vision and Pattern Recognition

arXiv:2006.11161 (cs)

[Submitted on 13 Jun 2020 (v1), last revised 30 Sep 2020 (this version, v4)]

Abstract: Recently, learning-based models have enhanced the performance of single-image super-resolution (SISR). However, applying SISR successively to each video frame leads to a lack of temporal coherency. Convolutional neural networks (CNNs) outperform traditional approaches in terms of image quality metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). However, generative adversarial networks (GANs) offer a competitive advantage by mitigating the lack of fine texture details usually seen with CNNs when super-resolving at large upscaling factors. We present iSeeBetter, a novel GAN-based spatio-temporal approach to video super-resolution (VSR) that renders temporally consistent super-resolution videos. iSeeBetter extracts spatial and temporal information from the current and neighboring frames using the concept of recurrent back-projection networks as its generator. Furthermore, to improve the "naturality" of the super-resolved image while eliminating artifacts seen with traditional algorithms, we utilize the discriminator from the super-resolution generative adversarial network (SRGAN). Although using mean squared error (MSE) as the primary loss-minimization objective improves PSNR/SSIM, these metrics may not capture fine details in the image, resulting in a misrepresentation of perceptual quality. To address this, we use a four-fold (MSE, perceptual, adversarial, and total-variation (TV)) loss function. Our results demonstrate that iSeeBetter offers superior VSR fidelity and surpasses state-of-the-art performance.
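As an illustration of the four-fold loss named in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes grayscale images stored as lists of rows with values in [0, 1], uses one common anisotropic form of total variation, and takes the perceptual and adversarial terms as precomputed scalars, since both would need a trained network. The function names and the weights are hypothetical placeholders and are not taken from the paper.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized 2-D images (lists of rows)."""
    diffs = [(x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def psnr(a, b, max_val=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(max_val ** 2 / err)


def total_variation(img):
    """Anisotropic total variation: sum of absolute neighbor differences."""
    h, w = len(img), len(img[0])
    vertical = sum(abs(img[i + 1][j] - img[i][j]) for i in range(h - 1) for j in range(w))
    horizontal = sum(abs(img[i][j + 1] - img[i][j]) for i in range(h) for j in range(w - 1))
    return vertical + horizontal


def four_fold_loss(sr, hr, perceptual, adversarial, weights=(1.0, 0.1, 0.01, 0.001)):
    """Weighted sum of MSE, perceptual, adversarial, and TV terms.

    `perceptual` and `adversarial` are placeholder scalars supplied by the caller.
    `weights` are illustrative values only (an assumption, not from the paper).
    """
    w_mse, w_perc, w_adv, w_tv = weights
    return (w_mse * mse(sr, hr)
            + w_perc * perceptual
            + w_adv * adversarial
            + w_tv * total_variation(sr))


# Toy 2x2 example: `sr` is the super-resolved output, `hr` the ground truth.
sr = [[0.0, 0.5], [0.5, 1.0]]
hr = [[0.0, 0.5], [0.5, 0.5]]
loss = four_fold_loss(sr, hr, perceptual=0.2, adversarial=0.3)
quality_db = psnr(sr, hr)
```

Minimizing the MSE term alone raises PSNR directly, because PSNR is a monotonic function of MSE. The other three terms are where a combined objective can trade a little PSNR for texture detail and smoothness.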

Comments:	11 pages, 6 figures, 4 tables, Project Page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Image and Video Processing (eess.IV)
Cite as:	arXiv:2006.11161 [cs.CV]
	(or arXiv:2006.11161v4 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2006.11161
Journal reference:	Springer Journal of Computational Visual Media, Tsinghua University Press, 6(3):307-317, 2020
Related DOI:	https://doi.org/10.1007/s41095-020-0175-7

Submission history

From: Aman Chadha Mr.
[v1] Sat, 13 Jun 2020 01:36:30 UTC (2,621 KB)
[v2] Mon, 22 Jun 2020 20:34:00 UTC (2,619 KB)
[v3] Sat, 29 Aug 2020 21:38:05 UTC (2,621 KB)
[v4] Wed, 30 Sep 2020 00:45:38 UTC (2,621 KB)
