DeepV2D: Video to Depth with Differentiable Structure from Motion
arXiv preprint, April 2019
Abstract
We propose DeepV2D, an end-to-end deep learning architecture for predicting
depth from video. DeepV2D combines the representation ability of neural
networks with the geometric principles governing image formation. We compose a
collection of classical geometric algorithms, which are converted into
trainable modules and combined into an end-to-end differentiable architecture.
DeepV2D interleaves two stages: camera motion estimation and depth estimation.
During inference, motion and depth estimation are alternated and quickly
converge to accurate depth. Code is available at
https://github.com/princeton-vl/DeepV2D.
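
The alternating inference scheme described above can be summarized in pseudocode. The sketch below is a minimal illustration, not the released implementation: the callables estimate_motion and estimate_depth, the fixed iteration count, and the flat-depth / identity-pose initialization are all assumptions made for exposition.

import numpy as np

def deepv2d_inference(frames, intrinsics, estimate_motion, estimate_depth, num_iters=5):
    """Alternate camera motion and depth estimation until the depth converges."""
    # Placeholder initialization: flat depth for the keyframe, identity poses for all frames.
    depth = np.ones(frames[0].shape[:2], dtype=np.float32)
    poses = [np.eye(4, dtype=np.float32) for _ in frames]

    for _ in range(num_iters):
        # Motion step: refine camera poses given the current depth estimate.
        poses = estimate_motion(frames, depth, poses, intrinsics)
        # Depth step: refine keyframe depth given the updated poses.
        depth = estimate_depth(frames, poses, intrinsics)

    return depth, poses

In practice the two steps would be the trainable motion and depth modules of the network; the loop simply alternates them until the estimates stabilize.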
Citation
Zachary Teed and Jia Deng.
"DeepV2D: Video to Depth with Differentiable Structure from Motion."
arXiv:1812.04605, April 2019.
BibTeX
@techreport{Teed:2019:DVT,
  author      = "Zachary Teed and Jia Deng",
  title       = "{DeepV2D}: Video to Depth with Differentiable Structure from Motion",
  institution = "arXiv preprint",
  year        = "2019",
  month       = apr,
  number      = "1812.04605"
}