TY - CPAPER
T1 - Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging
AU - Chugunov, Ilya
AU - Baek, Seung-Hwan
AU - Fu, Qiang
AU - Heidrich, Wolfgang
AU - Heide, Felix
N1 - Acknowledgements: This work was supported in part by KAUST baseline funding, Felix Heide's NSF CAREER Award (2047359) and Sony Faculty Innovation Award, and Ilya Chugunov's NSF Graduate Research Fellowship.
PY - 2021/6
AB - We introduce Mask-ToF, a method to reduce flying pixels (FP) in time-of-flight (ToF) depth captures. FPs are pervasive artifacts which occur around depth edges, where light paths from both an object and its background are integrated over the aperture. This light mixes at a sensor pixel to produce erroneous depth estimates, which can adversely affect downstream 3D vision tasks. Mask-ToF starts at the source of these FPs, learning a microlens-level occlusion mask which effectively creates a custom-shaped sub-aperture for each sensor pixel. This modulates the selection of foreground and background light mixtures on a per-pixel basis and thereby encodes scene geometric information directly into the ToF measurements. We develop a differentiable ToF simulator to jointly train a convolutional neural network to decode this information and produce high-fidelity, low-FP depth reconstructions. We test the effectiveness of Mask-ToF on a simulated light field dataset and validate the method with an experimental prototype. To this end, we manufacture the learned amplitude mask and design an optical relay system to virtually place it on a high-resolution ToF sensor. We find that Mask-ToF generalizes well to real data without retraining, cutting FP counts in half.
UR - http://hdl.handle.net/10754/668514
UR - https://ieeexplore.ieee.org/document/9578501/
UR - http://www.scopus.com/inward/record.url?scp=85123221826&partnerID=8YFLogxK
DO - 10.1109/CVPR46437.2021.00900
M3 - Conference contribution
SN - 9781665445092
SP - 9112
EP - 9122
BT - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
PB - IEEE
ER -