MotionRNN: A Flexible Model for Video Prediction with Spacetime-Varying Motions

Wu, Haixu; Yao, Zhiyu; Wang, Jianmin; Long, Mingsheng

arXiv:2103.02243 (cs)

[Submitted on 3 Mar 2021 (v1), last revised 8 Jul 2021 (this version, v3)]

Abstract: This paper approaches video prediction from a new angle: predicting spacetime-varying motions, which change incessantly across both space and time. Prior methods mainly capture the temporal state transitions but overlook the complex spatiotemporal variations of the motion itself, making it difficult for them to adapt to ever-changing motions. We observe that physical world motions can be decomposed into transient variation and motion trend, where the latter can be regarded as the accumulation of previous motions. Thus, simultaneously capturing the transient variation and the motion trend is the key to making spacetime-varying motions more predictable. Based on these observations, we propose the MotionRNN framework, which can capture the complex variations within motions and adapt to spacetime-varying scenarios. MotionRNN has two main contributions. The first is that we design the MotionGRU unit, which can model the transient variation and motion trend in a unified way. The second is that we apply the MotionGRU to RNN-based predictive models and present a new flexible video prediction architecture with a Motion Highway, which can significantly improve the ability to predict changeable motions and avoid motion vanishing in stacked multiple-layer predictive models. With high flexibility, this framework can adapt to a series of models for deterministic spatiotemporal prediction. MotionRNN yields significant improvements on three challenging benchmarks for video prediction with spacetime-varying motions.
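
The decomposition described in the abstract (a motion trend that accumulates previous motions, plus a transient variation on top of it) can be illustrated with a toy one-dimensional sketch. This is not the MotionGRU unit and is not taken from the paper: the function name `decompose_motion`, the smoothing factor `alpha`, and the use of an exponential moving average as the "accumulation" are all assumptions chosen only to make the idea concrete.

```python
def decompose_motion(motions, alpha=0.5):
    """Split a 1-D sequence of per-frame displacements into trend and transient parts.

    Assumption (illustrative only): the trend at step t is an exponential
    moving average of the displacements seen *before* step t, and the
    transient variation is whatever remains of the current displacement.
    """
    trend, transient = [], []
    running = 0.0  # accumulated summary of previous motions
    for m in motions:
        trend.append(running)            # trend uses past motions only
        transient.append(m - running)    # residual of the current motion
        running = alpha * running + (1 - alpha) * m  # fold current motion into the trend
    return trend, transient


# Hypothetical displacements: steady motion with a brief speed-up at the third frame.
motions = [1.0, 1.0, 1.5, 1.0]
trend, transient = decompose_motion(motions)
```

By construction, `trend[t] + transient[t]` reproduces `motions[t]` at every step, so the sketch only re-expresses the input; a predictive model would instead have to learn both parts from video features.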

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2103.02243 [cs.CV]
	(or arXiv:2103.02243v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2103.02243

Submission history

From: Zhiyu Yao
[v1] Wed, 3 Mar 2021 08:11:50 UTC (15,978 KB)
[v2] Thu, 4 Mar 2021 08:55:59 UTC (15,978 KB)
[v3] Thu, 8 Jul 2021 01:42:51 UTC (16,421 KB)
