ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) spotlight presentation, July 2017
Abstract
A key requirement for leveraging supervised deep learning methods is the
availability of large, labeled datasets. Unfortunately, in the context of RGB-D
scene understanding, very little data is available -- current datasets cover a
small range of scene views and have limited semantic annotations. To address
this issue, we introduce ScanNet, an RGB-D video dataset containing 2.5M views
in 1513 scenes annotated with 3D camera poses, surface reconstructions, and
semantic segmentations. To collect this data, we designed an easy-to-use and
scalable RGB-D capture system that includes automated surface reconstruction
and crowdsourced semantic annotation. We show that using this data helps
achieve state-of-the-art performance on several 3D scene understanding tasks,
including 3D object classification, semantic voxel labeling, and CAD model
retrieval. The dataset is freely available at http://www.scan-net.org.
Citation
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nießner.
"ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes."
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) spotlight presentation, July 2017.
BibTeX
@inproceedings{Dai:2017:SR3,
   author = "Angela Dai and Angel X. Chang and Manolis Savva and Maciej Halber and
      Thomas Funkhouser and Matthias Nie{\ss}ner",
   title = "{ScanNet}: Richly-annotated {3D} Reconstructions of Indoor Scenes",
   booktitle = "IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
      spotlight presentation",
   year = "2017",
   month = jul
}