TY - GEN
T1 - Scene structure inference through scene map estimation
AU - Hueting, Moos
AU - Pătrăucean, Viorica
AU - Ovsjanikov, Maks
AU - Mitra, Niloy J.
N1 - Publisher Copyright: © 2016 The Author(s). Eurographics Proceedings © 2016 The Eurographics Association.
PY - 2016
Y1 - 2016
AB - Understanding indoor scene structure from a single RGB image is useful for a wide variety of applications ranging from the editing of scenes to the mining of statistics about space utilization. Most efforts in scene understanding focus on extraction of either dense information such as pixel-level depth or semantic labels, or very sparse information such as bounding boxes obtained through object detection. In this paper we propose the concept of a scene map, a coarse scene representation, which describes the locations of the objects present in the scene from a top-down view (i.e., as they are positioned on the floor), as well as a pipeline to extract such a map from a single RGB image. To this end, we use a synthetic rendering pipeline, which supplies an adapted CNN with virtually unlimited training data. We quantitatively evaluate our results, showing that we clearly outperform a dense baseline approach, and argue that scene maps provide a useful representation for abstract indoor scene understanding.
UR - http://www.scopus.com/inward/record.url?scp=85019629086&partnerID=8YFLogxK
U2 - 10.2312/vmv.20161341
DO - 10.2312/vmv.20161341
M3 - Conference contribution
AN - SCOPUS:85019629086
T3 - VMV 2016 - Vision, Modeling and Visualization
SP - 45
EP - 52
BT - VMV 2016 - Vision, Modeling and Visualization
A2 - Fellner, Dieter
PB - Eurographics Association
T2 - 21st International Symposium on Vision, Modeling and Visualization, VMV 2016
Y2 - 10 October 2016 through 12 October 2016
ER -