ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems
European Conference on Computer Vision (ECCV) oral presentation, September 2018
Abstract
In this paper we present ActiveStereoNet, the first deep
learning solution for active stereo systems. Due to the lack of ground
truth, our method is fully self-supervised, yet it produces precise depth
with a subpixel precision of 1/30th of a pixel; it does not suffer from the
common over-smoothing issues; it preserves the edges; and it explicitly
handles occlusions. We introduce a novel reconstruction loss that is more
robust to noise and texture-less patches, and is invariant to illumination
changes. The proposed loss is optimized using a window-based cost
aggregation with an adaptive support weight scheme. This cost aggregation
is edge-preserving and smooths the loss function, which is key to allowing
the network to reach compelling results. Finally, we show how the task
of predicting invalid regions, such as occlusions, can be trained
end-to-end without ground truth. This component is crucial to reduce blur and
particularly improves predictions along depth discontinuities. Extensive
quantitative and qualitative evaluations on real and synthetic data
demonstrate state-of-the-art results in many challenging scenes.
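
The window-based cost aggregation with adaptive support weights mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `aggregate_cost`, the exponential intensity-similarity weight, and the default `radius` and `sigma` values are assumptions chosen for illustration. Each pixel's cost is replaced by a weighted average over its window, where neighbors with a similar intensity to the center pixel contribute more, so the cost is smoothed within a surface but not across an intensity edge.

```python
import math


def aggregate_cost(image, cost, radius=1, sigma=2.0):
    """Adaptive-support-weight aggregation of a per-pixel cost map.

    image, cost: 2D lists of floats with the same shape.
    radius, sigma: illustrative values (assumptions, not from the paper page).
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            weighted_sum = 0.0
            weight_total = 0.0
            # Visit the window around (x, y), clipped at the image border.
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    # Assumed weight: decays with intensity difference to the center.
                    w = math.exp(-abs(image[y][x] - image[ny][nx]) / sigma)
                    weighted_sum += w * cost[ny][nx]
                    weight_total += w
            # weight_total >= 1 because the center pixel always has weight 1.
            out[y][x] = weighted_sum / weight_total
    return out


if __name__ == "__main__":
    # Two flat regions separated by a strong intensity edge.
    image = [[0.0, 0.0, 100.0, 100.0] for _ in range(3)]
    cost = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]
    smoothed = aggregate_cost(image, cost)
    print(smoothed[1][1], smoothed[1][2])
```

On an image of constant intensity all weights equal 1 and the scheme reduces to a plain box filter; across a strong edge the weights fall toward zero, which is the edge-preserving behavior the abstract describes.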
Links
- Paper page on arXiv
Citation
Yinda Zhang, Sameh Khamis, Christoph Rhemann, Julien Valentin, Adarsh Kowdle, Vladimir Tankovich, Michael Schoenberg, Shahram Izadi, Thomas Funkhouser, and Sean Fanello.
"ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems."
European Conference on Computer Vision (ECCV) oral presentation, September 2018.
BibTeX
@inproceedings{Zhang:2018:AES,
  author = "Yinda Zhang and Sameh Khamis and Christoph Rhemann and Julien Valentin and Adarsh Kowdle and Vladimir Tankovich and Michael Schoenberg and Shahram Izadi and Thomas Funkhouser and Sean Fanello",
  title = "{ActiveStereoNet}: End-to-End Self-Supervised Learning for Active Stereo Systems",
  booktitle = "European Conference on Computer Vision (ECCV) oral presentation",
  year = "2018",
  month = sep
}