Data compression in the presence of side information and sparsity

Lead Research Organisation: University of Cambridge
Department Name: Engineering

Abstract

It has been broadly recognised in recent years that many important types of data - including images, audio, video, medical data, neuroscience data, and graphs representing communication, social or biological networks - naturally admit sparse representations. Also, in many practical applications the data to be compressed comes in groups of two or more correlated streams.
We will initiate the study of developing general information-theoretic frameworks for these two problems: The explicit identification of the fundamental limits for storage and statistical analysis of sparse data, and the design and application of novel algorithmic methods for the efficient and effective compression of data in the presence of useful side information.
Our main aims regarding sparse data are to understand and derive: (i) The theoretical basis describing the fundamentally best achievable performance for sparse data compression and analysis; (ii) Methodological guidelines for the design of practical compression algorithms for sparse data; (iii) Effective general-purpose strategies as well as application-specific algorithms for the compression of long-memory sources in the presence of correlated side information.
For sparse sources, the proposed treatment closely parallels the classical information-theoretic development, but it also presents significant challenges as it requires different mathematical and statistical tools. In particular, the classical and more recent asymptotic methods based on Gaussian models do not apply for sparse data. New techniques from convex optimization and Poisson approximation need to be brought into this area.
For compression with side information, in order to establish optimality properties for relevant source coding algorithms, the techniques used for the analogous single-source results do not suffice. We need to employ new probabilistic tools and methods from ergodic theory.
Preliminary results in the case of sparse data with realistic sparsity characteristics are very encouraging. We believe these should lead to very precise achievability and converse compression-coding theorems, which will be proved using a combination of classical probabilistic and combinatorial techniques, and modern Poisson approximation and optimization methods.
And in the case of compression with side information, we believe that our preliminary probabilistic results will likely lead to efficient and easy to implement estimators for the conditional entropy, as well as to novel universal compression algorithms.

Publications

10 25 50

Studentship Projects

Project Reference Relationship Related To Start End Student Name
EP/R513180/1 01/10/2018 30/09/2023
2107993 Studentship EP/R513180/1 01/10/2018 30/09/2021 Lampros Gavalakis