FlatNet: Towards Photorealistic Scene
Reconstruction from Lensless Measurements
IEEE TPAMI 2020

*denotes equal contribution

Abstract


Lensless imaging has emerged as a potential solution towards realizing ultra-miniature cameras by eschewing the bulky lens in a traditional camera. Without a focusing lens, the lensless cameras rely on computational algorithms to recover the scenes from multiplexed measurements. However, the current iterative-optimization-based reconstruction algorithms produce noisier and perceptually poorer images. In this work, we propose a non-iterative deep learning-based reconstruction approach that results in orders of magnitude improvement in image quality for lensless reconstructions. Our approach, called FlatNet, lays down a framework for reconstructing high-quality photorealistic images from mask-based lensless cameras, where the camera’s forward model formulation is known.

Method


Method overview.


FlatNet consists of two stages: (1) an inversion stage that maps the measurement into a space of intermediate reconstruction by learning parameters within the forward model formulation, and (2) a perceptual enhancement stage that improves the perceptual quality of this intermediate reconstruction. These stages are trained together in an end-to-end manner.
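The two-stage structure can be sketched in a few lines. This is a minimal illustration, not the released implementation: it assumes the separable case, where the inversion stage can be written as a left and a right matrix multiplication, and it stands in a simple clipping function for the perceptual enhancement network. The names `separable_inversion`, `flatnet_forward`, `w_left`, `w_right` and `enhance` are hypothetical.

```python
import numpy as np

def separable_inversion(measurement, w_left, w_right):
    """Stage 1 (assumed separable form): intermediate = W_left @ Y @ W_right.

    In a trainable setting w_left and w_right would be learned parameters.
    """
    return w_left @ measurement @ w_right

def flatnet_forward(measurement, w_left, w_right, enhance):
    """Run inversion followed by enhancement, mirroring the two stages."""
    intermediate = separable_inversion(measurement, w_left, w_right)
    # Stage 2: any image-to-image model; here a placeholder callable.
    return enhance(intermediate)

rng = np.random.default_rng(0)
y = rng.standard_normal((6, 8))        # toy sensor measurement
w_left = rng.standard_normal((4, 6))   # toy stand-in for learned weights
w_right = rng.standard_normal((8, 5))  # toy stand-in for learned weights

# Placeholder enhancement: clip to a valid intensity range.
out = flatnet_forward(y, w_left, w_right, enhance=lambda x: np.clip(x, 0.0, 1.0))
```

In an end-to-end setup, gradients from the loss on the final output would flow through `enhance` back into `w_left` and `w_right`, which is what training the stages together refers to.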

We show high-quality reconstructions by performing extensive experiments on real and challenging scenes using two different types of lensless prototypes: FlatCam, which uses a separable forward model, and PhlatCam, which uses a more general non-separable cropped-convolution model. Our end-to-end approach is fast, produces photorealistic reconstructions, and is easy to adopt for other mask-based lensless cameras.
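The two forward models can be illustrated with a small simulation. This is a sketch under assumptions: the separable model is written as Y = Phi_L X Phi_R^T, and the cropped-convolution model as a full linear convolution of the scene with the PSF followed by a centered crop to the sensor size. Noise, color channels and calibration details are ignored, and the function names are hypothetical.

```python
import numpy as np

def separable_forward(scene, phi_left, phi_right):
    """Separable model (assumed form): Y = Phi_L @ X @ Phi_R^T."""
    return phi_left @ scene @ phi_right.T

def cropped_conv_forward(scene, psf, sensor_shape):
    """Non-separable model: linear convolution with the PSF, then a crop."""
    full_shape = (scene.shape[0] + psf.shape[0] - 1,
                  scene.shape[1] + psf.shape[1] - 1)
    # Zero-padding both inputs to full_shape makes the FFT product
    # equal to a linear (not circular) convolution.
    spectrum = np.fft.rfft2(scene, full_shape) * np.fft.rfft2(psf, full_shape)
    full = np.fft.irfft2(spectrum, full_shape)
    # Keep only the centered window that lands on the sensor.
    top = (full_shape[0] - sensor_shape[0]) // 2
    left = (full_shape[1] - sensor_shape[1]) // 2
    return full[top:top + sensor_shape[0], left:left + sensor_shape[1]]

scene = np.arange(16, dtype=float).reshape(4, 4)
delta_psf = np.zeros((3, 3))
delta_psf[1, 1] = 1.0  # a centered impulse leaves the scene unchanged

y_sep = separable_forward(scene, np.eye(4), np.eye(4))
y_conv = cropped_conv_forward(scene, delta_psf, sensor_shape=(4, 4))
y_small = cropped_conv_forward(scene, delta_psf, sensor_shape=(2, 2))
```

With a sensor smaller than the full convolution output, as in `y_small`, part of the signal falls outside the crop and is not recorded.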

Spotlight Video at CVPR 2020



Presented at the CVPR Computational Cameras and Displays (CCD) Workshop 2020.

 [Slides]

FlatNet without Calibration



Calibration of lensless cameras to obtain the Point Spread Function (PSF) can be a time-consuming process and has to be done for each individual camera. Even a small error in calibration can lead to severe degradation in the performance of the reconstruction algorithm.

Since FlatNet employs a trainable inversion layer, it does not require careful calibration of the PSF. We provide a specific initialization scheme of the trainable layer for both separable and non-separable cases. Even without a measured PSF, the inversion layer can be initialized from the mask profile and the camera geometry. Please refer to our paper for more details.
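As an illustration of initializing a Fourier-domain inversion layer from an approximate PSF, the sketch below uses a Wiener-style regularized inverse filter. This is one common choice and is an assumption here, not necessarily the exact scheme described in the paper. It also assumes a circular convolution model and a PSF centered in its array; the `reg` parameter and the function names are hypothetical.

```python
import numpy as np

def wiener_filter_init(psf, reg=1e-2):
    """Regularized inverse filter conj(H) / (|H|^2 + reg), built from a PSF.

    The PSF may come from simulation (mask profile and geometry) rather
    than from a careful calibration; training would then refine the filter.
    """
    # Move the PSF center to index (0, 0) so the filter adds no spatial shift.
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    return np.conj(otf) / (np.abs(otf) ** 2 + reg)

def apply_fourier_inversion(measurement, inverse_filter):
    """Apply the filter by pointwise multiplication in the Fourier domain."""
    return np.real(np.fft.ifft2(np.fft.fft2(measurement) * inverse_filter))

psf = np.zeros((8, 8))
psf[4, 4] = 1.0  # toy PSF: a centered impulse
w = wiener_filter_init(psf, reg=1e-2)

y = np.arange(64, dtype=float).reshape(8, 8)  # toy measurement
x_hat = apply_fourier_inversion(y, w)
```

For the impulse PSF, the filter reduces to a constant scaling of 1 / (1 + reg), so `x_hat` is a slightly attenuated copy of `y`. With a realistic PSF, the same filter would serve only as a starting point that end-to-end training can correct.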

Working with smaller sensors


Another practical scenario that arises in non-separable lensless cameras is the finite size of sensors. Such cameras can be approximated by a convolutional model, and as a consequence, the sensor measurement is the weighted sum of various shifted PSFs. For a large PSF, the measurement can often exceed the sensor size, leading to lost information.

While FlatNet is based on the convolutional model, we show how it can be extended to robustly handle smaller sensors with a simple padding scheme. As seen in the figure above, our padding scheme greatly improves the intermediate reconstruction. Finally, our trainable inversion further alleviates visual artefacts, allowing FlatNet to recover scenes on smaller sensors without any significant performance degradation. Please refer to our paper for more details.
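A padding step of this kind can be sketched as follows. The example assumes replicate (edge) padding of the cropped measurement up to the size expected by the Fourier-domain inversion; the padding mode and the `pad_measurement` helper are assumptions for illustration, and the paper should be consulted for the scheme actually used.

```python
import numpy as np

def pad_measurement(measurement, target_shape, mode="edge"):
    """Pad a cropped 2-D measurement symmetrically up to target_shape.

    mode="edge" repeats the border pixels outward, which avoids the sharp
    discontinuity that zero padding would introduce at the sensor boundary.
    """
    pad_h = target_shape[0] - measurement.shape[0]
    pad_w = target_shape[1] - measurement.shape[1]
    if pad_h < 0 or pad_w < 0:
        raise ValueError("target_shape must be at least the measurement shape")
    pads = ((pad_h // 2, pad_h - pad_h // 2),
            (pad_w // 2, pad_w - pad_w // 2))
    return np.pad(measurement, pads, mode=mode)

small = np.arange(12, dtype=float).reshape(3, 4)  # toy small-sensor capture
padded = pad_measurement(small, target_shape=(7, 8))
```

The padded array can then be passed to the inversion stage in place of the full-size measurement.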

Lensless Imaging in Indoor Lighting



While we trained FlatNet using a monitor-capture scheme, which allows us to inexpensively collect a large dataset, our objective is to recover scenes from real measurements captured in the wild. To this end, we finetune FlatNet on a real-world dataset that we captured, called the Unconstrained Indoor Dataset.

This dataset consists of unaligned webcam and PhlatCam captures. As seen in the figure above, such a finetuning scheme results in more photorealistic reconstructions.

Citation


If you find our code or datasets useful in your research, please cite the following:




Related Links


FlatCam was introduced by Asif et al. (2015), using a separable amplitude mask to replace a conventional lens.

Antipa et al. (2017) further showed how off-the-shelf diffusers and other random caustics (DiffuserCam) could be used as phase masks in lensless cameras.

Boominathan et al. (2020) recently developed PhlatCam, which further improves light efficiency and reconstruction quality by using a transparent phase mask.

Our previous work, Khan et al. (2019), introduced FlatNet for separable lensless models, demonstrating how a learnable inversion layer can be coupled with an image enhancement model such as UNet.

Concurrently, Monakhova et al. (2019) proposed Le-ADMM, which is an unrolled neural network architecture using learnable ADMM optimization steps.

Acknowledgements


This work was supported in part by NSF CAREER: IIS-1652633, NSF EXPEDITIONS: CCF-1730574, DARPA NESD: HR0011-17-C0026, NIH Grant: R21EY029459 and the Qualcomm Innovation Fellowship India 2020.

This website template was borrowed from Michaël Gharbi and Matthew Tancik.