FlatNet: Towards Photorealistic Scene
Reconstruction from Lensless Measurements
IEEE TPAMI 2020
- Salman S. Khan* IIT Madras
- Varun Sundar* IIT Madras
- Vivek Boominathan Rice University
- Ashok Veeraraghavan Rice University
- Kaushik Mitra IIT Madras
Abstract
Lensless imaging has emerged as a potential solution towards realizing ultra-miniature cameras
by eschewing the bulky lens in a traditional camera. Without a focusing lens, the lensless
cameras rely on computational algorithms to recover the scenes from multiplexed measurements.
However, current iterative-optimization-based reconstruction algorithms produce noisy and perceptually poor images. In this work, we propose a non-iterative, deep-learning-based reconstruction approach that yields orders-of-magnitude improvement in image quality for lensless reconstructions. Our approach, called FlatNet, lays down a framework for reconstructing high-quality photorealistic images from mask-based lensless cameras, where the camera's forward model formulation is known.
Please reach out to me at salmansiddique.khan@gmail.com in case you run into any issues with the datasets/code.
Method
FlatNet consists of two stages: (1) an inversion stage that maps the measurement to an intermediate reconstruction by learning parameters within the forward model formulation, and (2) a perceptual enhancement stage that improves the perceptual quality of this intermediate reconstruction. The two stages are trained together end-to-end.
We demonstrate high-quality reconstructions through extensive experiments on real and challenging scenes using two different lensless prototypes: FlatCam, which uses a separable forward model, and PhlatCam, which uses a more general non-separable cropped-convolution model. Our end-to-end approach is fast, produces photorealistic reconstructions, and is easy to adopt for other mask-based lensless cameras.
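As a rough sketch of the two-stage pipeline under the separable (FlatCam-style) forward model, the following NumPy toy example shows a trainable inversion initialized from the calibrated mask matrices, followed by a placeholder for the enhancement stage. All sizes are illustrative, and the clipping step merely stands in for FlatNet's trained enhancement network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy separable forward model (FlatCam-style): Y = Phi_L @ X @ Phi_R.T.
# Sizes here are illustrative, not the prototypes' actual resolutions.
n, m = 64, 128
phi_l = rng.standard_normal((m, n)) / np.sqrt(m)  # left mask matrix
phi_r = rng.standard_normal((m, n)) / np.sqrt(m)  # right mask matrix

x = rng.random((n, n))            # ground-truth scene intensity
y = phi_l @ x @ phi_r.T           # multiplexed lensless measurement

# Stage 1 -- trainable inversion: X_int = W1 @ Y @ W2.T, with W1 and W2
# initialized here from pseudo-inverses of the mask matrices (one option).
w1 = np.linalg.pinv(phi_l)
w2 = np.linalg.pinv(phi_r)
x_int = w1 @ y @ w2.T             # intermediate reconstruction

# Stage 2 -- perceptual enhancement: in FlatNet this is a trained network;
# a simple clipping placeholder stands in for it in this sketch.
x_out = np.clip(x_int, 0.0, 1.0)
```

In the noiseless toy setting the pseudo-inverse initialization already recovers the scene almost exactly; in practice both stages are trained jointly, so the inversion matrices drift from this initialization.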
Spotlight Video at CVPR 2020
Presented at the CVPR Computational Cameras and Displays (CCD) Workshop 2020.
FlatNet without Calibration
Calibrating a lensless camera to obtain its point spread function (PSF) can be a time-consuming process and must be repeated for each individual camera. Even a small calibration error can severely degrade the performance of the reconstruction algorithm.
Since FlatNet employs a trainable inversion layer, it does not require careful calibration of the PSF. Given only the mask profile and the camera geometry, the inversion layer can still be initialized; we provide a specific initialization scheme for both the separable and non-separable cases. Please refer to our paper for more details.
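For the non-separable (convolutional) case, one common way to initialize an inversion layer without a calibration capture is a Wiener-style filter built from a nominal PSF, i.e. one predicted from the mask profile and camera geometry. The sketch below uses a circular convolution as a stand-in for the real forward model, and a random array as the nominal PSF; both are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
h, w = 64, 64

# Nominal PSF, standing in for one simulated from the mask profile and
# camera geometry (no per-camera calibration capture).
psf = rng.random((h, w))
psf /= psf.sum()

x = rng.random((h, w))
# Circular convolution as a stand-in for the lensless forward model.
y = np.real(np.fft.ifft2(np.fft.fft2(psf) * np.fft.fft2(x)))

# Wiener-style initialization of the inversion filter; the regularizer k
# (tiny here since the simulation is noiseless) can later be trained.
k = 1e-6
H = np.fft.fft2(psf)
w_init = np.conj(H) / (np.abs(H) ** 2 + k)
x_int = np.real(np.fft.ifft2(w_init * np.fft.fft2(y)))
```

Because the filter is only an initialization, residual errors from an imperfect nominal PSF can be absorbed as the inversion layer is trained end-to-end with the enhancement stage.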
Working with smaller sensors
Another practical issue in non-separable lensless cameras is the finite size of the sensor. The imaging process of such cameras can be approximated by a convolutional model, so the sensor measurement is a weighted sum of shifted PSFs. For a large PSF, the measurement can extend beyond the sensor, and information is lost.
While FlatNet is based on the convolutional model, we show how it can be extended to robustly handle smaller sensors with a simple padding scheme. As seen in the figure above, our padding scheme greatly improves the intermediate reconstruction. Finally, our trainable inversion further alleviates visual artefacts, allowing FlatNet to recover scenes on smaller sensors without any significant performance degradation. Please refer to our paper for more details.
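A minimal illustration of the padding idea: the measurement cropped by a small sensor is padded back to the full convolution size before a Wiener-style inversion. Zero padding is used here for simplicity; the sizes and the exact padding choice in the paper may differ:

```python
import numpy as np

rng = np.random.default_rng(2)
full, sensor = 64, 48                     # full convolution size vs. sensor size
psf = rng.random((full, full))
psf /= psf.sum()
x = rng.random((full, full))

# Circular convolution stands in for the cropped-convolution forward model.
y_full = np.real(np.fft.ifft2(np.fft.fft2(psf) * np.fft.fft2(x)))
lo = (full - sensor) // 2
y_crop = y_full[lo:lo + sensor, lo:lo + sensor]   # finite sensor crops the measurement

# Pad the cropped measurement back to the full size before inversion.
y_pad = np.zeros((full, full))
y_pad[lo:lo + sensor, lo:lo + sensor] = y_crop

H = np.fft.fft2(psf)
w_inv = np.conj(H) / (np.abs(H) ** 2 + 1e-3)
x_int = np.real(np.fft.ifft2(w_inv * np.fft.fft2(y_pad)))
```

The padding alone cannot recover the discarded measurements, so boundary artefacts remain in the intermediate reconstruction; the trainable inversion and enhancement stages then suppress them during end-to-end training.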
Lensless Imaging in Indoor Lighting
While we trained FlatNet using a monitor-capture scheme, which allows us to inexpensively collect a large dataset, our objective is to recover scenes from real measurements captured in the wild. We therefore finetune FlatNet on a real-world dataset we captured, called the Unconstrained Indoor Dataset.
This dataset consists of unaligned webcam and PhlatCam captures. As seen in the figure above, such a finetuning scheme results in more photorealistic reconstructions.
Citation
If you find our code or datasets useful in your research, please cite the following:
Related Links
FlatCam was introduced by Asif et al. (2015), using a separable amplitude mask to replace a conventional lens.
Antipa et al. (2017) further showed how off-the-shelf diffusers and other random caustics (DiffuserCam) can be used as phase masks in lensless cameras.
Boominathan et al. (2020) recently developed PhlatCam, which further improves light efficiency and reconstruction quality by using a transparent phase mask.
Our previous work, Khan et al. (2019), introduced FlatNet for separable lensless models, demonstrating how a learnable inversion layer can be coupled with an image enhancement model such as a U-Net.
Concurrently, Monakhova et al. (2019) proposed Le-ADMM, an unrolled neural network architecture using learnable ADMM optimization steps.
Acknowledgements
This work was supported in part by NSF CAREER: IIS-1652633, NSF EXPEDITIONS: CCF-1730574, DARPA NESD: HR0011-17-C0026, NIH Grant: R21EY029459, and the Qualcomm Innovation Fellowship India 2020.
This website template was borrowed from Michaël Gharbi and Matthew Tancik.