Lossy Light Field Compression Using Modern Deep Learning and Domain Randomization Techniques
Svetozar Zarko Valtchev
York University, Toronto
[Paper]
[Dataset]
[Slides]
[Bibtex]

[Figure panels: Ours · Ours + CNE · True]
Examples of the full and zoomed-in Origami scene from the HCI Light Field Dataset, reconstructed using our method with and without the Convolutional Neural Enhancer (CNE), at a bitrate of 0.2 bpp.



Abstract

Lossy data compression is a type of information encoding that uses approximations to trade accuracy for smaller file sizes. The transmission and storage of images is a typical example in the modern digital world. However, the reconstructed images often suffer from degradation and display observable visual artifacts. Convolutional Neural Networks (CNNs) have garnered much attention in all corners of Computer Vision (CV), including the tasks of image compression and artifact reduction. We study how lossy compression can be extended to higher-dimensional images with varying viewpoints, known as light fields. Domain Randomization (DR) is explored in detail and used to generate the largest light field dataset we are aware of, which serves as our training data. We formulate the task of compression under the framework of neural networks and calculate a quantization tensor for the 4-D Discrete Cosine Transform (DCT) coefficients of the light fields. To train the network accurately, a high-degree approximation to the rounding operation is introduced. In addition, we present a multi-resolution convolutional light field enhancer, producing average gains of 0.854 dB in Peak Signal-to-Noise Ratio (PSNR) and 0.0338 in Structural Similarity Index Measure (SSIM) over the base model, across a wide range of bitrates.
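The quantize-and-round pipeline described above can be sketched as follows. This is an illustrative approximation only: the differentiable rounding here is a truncated Fourier series of the sawtooth x − round(x), which is one common high-degree surrogate, not necessarily the exact formulation used in the thesis, and the quantization tensor `Q` is a placeholder rather than the calculated one.

```python
import numpy as np
from scipy.fft import dctn, idctn

def soft_round(x, terms=50):
    """Differentiable approximation of round(x): subtract a truncated
    Fourier series of the 1-periodic sawtooth x - round(x)."""
    k = np.arange(1, terms + 1)
    # Broadcast every element of x against the harmonics k.
    s = np.sum(((-1.0) ** (k + 1))
               * np.sin(2 * np.pi * np.outer(x.ravel(), k))
               / (np.pi * k), axis=1)
    return x - s.reshape(x.shape)

# Toy 4-D "light field" block: (angular u, angular v, spatial y, spatial x).
lf = np.random.default_rng(0).standard_normal((9, 9, 8, 8))

coeffs = dctn(lf, norm="ortho")        # 4-D DCT over all four axes
Q = np.full(lf.shape, 0.5)             # hypothetical quantization tensor
quantized = soft_round(coeffs / Q)     # smooth, so gradients flow in training
reconstructed = idctn(quantized * Q, norm="ortho")
```

At inference time the smooth surrogate would simply be replaced by the true `round`, since gradients are no longer needed.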


Talk


[Slides]


LIAM LF Dataset

We produce 20,000 synthetic light fields at a 512×512 spatial resolution and a 9×9 angular resolution, as can be seen in the examples below. The dataset also includes depth maps and segmentation maps for the central sub-aperture image of each light field, as well as camera intrinsics. A small part of the dataset is publicly available on Kaggle, while the full dataset is available on request. Please cite the publication if you use the LIAM-LF-Dataset in your research.
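Assuming each light field is held in memory as a 5-D array ordered (angular rows, angular columns, height, width, RGB) — the on-disk layout of the released files may differ — the stated 9×9 angular and 512×512 spatial resolutions imply indexing like this:

```python
import numpy as np

# Hypothetical in-memory layout: 9x9 angular grid of 512x512 RGB views.
lf = np.zeros((9, 9, 512, 512, 3), dtype=np.uint8)

# Central sub-aperture image, i.e. the view the depth and
# segmentation maps are aligned with (index 4 of 0..8).
center_sai = lf[4, 4]          # shape (512, 512, 3)

# A horizontal epipolar-plane image (EPI): fix the angular row,
# a spatial row, and one colour channel.
epi = lf[4, :, 256, :, 0]      # shape (9, 512)
```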


[Dataset]


More Results

More examples from the HCI Light Field Dataset reconstructed using the Convolutional Neural Enhancer (CNE) at a bitrate of 0.2 bpp. High-frequency details, such as the leaves of the plants and the text, appear to be well preserved, but banding artifacts are also present in some of the scenes.


Paper and Supplementary Material

S.Z. Valtchev.
Lossy Light Field Compression Using Modern Deep Learning and Domain Randomization Techniques.
YorkSpace Institutional Repository, York University, 2022.


[Bibtex]



Acknowledgements

This work has been supported by the Natural Sciences and Engineering Research Council of Canada, and by the Canada Research Chairs program.