Communication under synchronization errors: coding and limitations

Lead Research Organisation: Imperial College London
Department Name: Computing

Abstract

The aim of this research project is the study of coding schemes for, and information-theoretic limits of, communication models with synchronization errors. These are types of errors where symbols are deleted from, or inserted into, a communication stream. Such communication settings have practical applications in computational biology and data storage, and are related to important open questions in information theory and theoretical computer science. The student will develop novel techniques based on convex optimization to obtain upper bounds on the efficiency of communication schemes in such models, and will apply the theory of pseudorandomness from theoretical computer science to design novel coding schemes.

Theoretical Computing

Publications

10 25 50

Studentship Projects

Project Reference Relationship Related To Start End Student Name
EP/N509486/1 01/10/2016 31/03/2022
1971087 Studentship EP/N509486/1 01/10/2017 31/03/2021 Joao Lourenco Ribeiro
 
Description New upper bounds on the maximum information transmission rate through communication channels with synchronization errors (deletions and replications of input symbols) have been derived. In particular, these new bounds improve upon the previous state-of-the-art bounds for some range of error rates, and require significantly less computing power to be computed. This also includes the first non-trivial upper bound on channels with input symbol replications only that requires no computer assistance, and thus can be completely checked by humans.

The techniques above have also been used to obtain significantly improved upper bounds on the maximum information transmission rate through a class of channels modelling optic communication.

The problem of transmitting information simultaneously through multiple channels with synchronization errors has also been considered. Remarkably, it is shown that if one allows computationally efficient low-redundancy coding of input strings, then it is possible to significantly reduce the number of channels required for exact reconstruction of the underlying string as compared to the uncoded case. This is a first step towards developing low-redundancy codes for DNA-based data storage with computationally efficient encoding and decoding procedures and mathematical guarantees.
Exploitation Route The outcomes of this funding present new ideas and techniques that open the door to the development of practical low-redundancy codes for DNA-based data storage, an emerging data storage technology. Our results also lead to a better understanding of the storage limitations of this technology through the upper bounds on information transmission rates of channels with synchronization errors.
Sectors Digital/Communication/Information Technologies (including Software)