TY - CONF
T1 - DuLa-Net: A Dual-Projection Network for Estimating Room Layouts From a Single RGB Panorama
AU - Yang, Shang-Ta
AU - Wang, Fu-En
AU - Peng, Chi-Han
AU - Wonka, Peter
AU - Sun, Min
AU - Chu, Hung-Kuo
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledged KAUST grant number(s): URF/1/3426-01-01
Acknowledgements: The project was funded in part by the KAUST Office of Sponsored Research (OSR) under Award No. URF/1/3426-01-01, and the Ministry of Science and Technology of Taiwan (107-2218-E-007-047- and 107-2221-E-007-088-MY3)
PY - 2019
Y1 - 2019
N2 - We present a deep learning framework, called DuLa-Net, to predict Manhattan-world 3D room layouts from a single RGB panorama. To achieve better prediction accuracy, our method leverages two projections of the panorama at once, namely the equirectangular panorama-view and the perspective ceiling-view, each of which contains different cues about the room layout. Our network architecture consists of two encoder-decoder branches for analyzing each of the two views. In addition, a novel feature fusion structure is proposed to connect the two branches, which are then jointly trained to predict the 2D floor plans and layout heights. To learn more complex room layouts, we introduce the Realtor360 dataset, which contains panoramas of Manhattan-world room layouts with different numbers of corners. Experimental results show that our work outperforms recent state-of-the-art methods in prediction accuracy and performance, especially in rooms with non-cuboid layouts.
UR - http://hdl.handle.net/10754/660305
UR - https://ieeexplore.ieee.org/document/8953219/
UR - http://www.scopus.com/inward/record.url?scp=85078785348&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2019.00348
DO - 10.1109/CVPR.2019.00348
M3 - Conference contribution
SN - 9781728132938
SP - 3358
EP - 3367
BT - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
PB - IEEE
ER -