Automated Classical-to-Quantum Data Encoding for Genomics Data

Lead Participant: HOMOMORPHIC LABS LTD

Abstract

Biomedical data is an essential resource for developing machine learning models that can aid in the diagnosis, treatment, and prevention of diseases. However, the collection, storage, and sharing of biomedical data present significant challenges because of its sensitive nature and the ethical considerations involved. Healthcare data is subject to strict regulations and privacy laws, making it challenging for researchers to access and share data. Moreover, machine learning models trained on a single dataset tend to overfit and may not generalise well to new data, which limits their potential use in real-world applications.

To address these challenges, we must explore new approaches to biomedical machine learning that can leverage large and diverse datasets whilst also ensuring data privacy and security. One promising solution is hybrid classical-quantum federated learning, which combines the power of quantum computing with the benefits of federated learning (FL). This distributed quantum learning approach enables organisations to train hybrid quantum machine learning models on their respective classical datasets without sharing raw data. In federated learning, model training is distributed across individual devices or servers, each of which trains the model on its own dataset; in the hybrid setting, each has access to a quantum processing unit (QPU) to run the quantum part of the model.
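
The federated training loop described above can be illustrated with a minimal classical sketch. It assumes federated averaging (FedAvg) as the aggregation rule, which the abstract does not name, and it replaces the hybrid quantum model with a plain linear least-squares model so that the example runs without a QPU. The functions `local_update` and `federated_average`, the synthetic data, and all hyperparameters are hypothetical and chosen for illustration only.

```python
import numpy as np

def local_update(weights, features, labels, lr=0.1, steps=20):
    """Run gradient descent on one client's private data.

    Stand-in for local hybrid-model training; a linear least-squares
    model is an assumption made purely for illustration.
    """
    w = weights.copy()
    for _ in range(steps):
        grad = features.T @ (features @ w - labels) / len(labels)
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Combine client models, weighting each by its dataset size (FedAvg)."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()

# Synthetic, noise-free data for three hypothetical organisations.
rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0])
clients = []
for n in (30, 50, 20):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w))

# Only model weights travel between clients and server; raw data stays local.
global_w = np.zeros(2)
for _ in range(10):
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = federated_average(updates, [len(y) for _, y in clients])
```

In this sketch the server sees only the weight vectors returned by each client, which is the property that lets organisations collaborate without exchanging raw records.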

Unfortunately, classical datasets cannot be loaded directly into a quantum computer for processing; they must first be encoded into a form that a quantum computer can process. In essence, classical-to-quantum data encoding is the process of converting classical data into quantum states for use in a quantum algorithm. However, because the current generation of quantum hardware, known as Noisy Intermediate-Scale Quantum (NISQ) hardware, offers only a limited number of qubits, existing general-purpose data encoding methods are not always fit for purpose, especially for datasets such as DNA sequences. For example, we recently tested a popular encoding scheme known as 'amplitude encoding' on DNA sequences and found the following shortcoming: it is highly sensitive to the input data, as small changes in the DNA sequence can result in large changes in the output.
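
To make amplitude encoding concrete, the following is a minimal sketch that simulates it classically with NumPy. The nucleotide-to-number mapping `BASE_VALUES`, the zero-padding strategy, and the helper names are assumptions for illustration; the abstract does not say how the project's own tests were set up. The sketch computes the unit-norm amplitude vector for a non-empty DNA sequence and the overlap between a reference sequence and a copy with one substituted base.

```python
import math
import numpy as np

# Hypothetical numeric mapping of nucleotides; the source does not specify one.
BASE_VALUES = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}

def amplitude_encode(sequence: str) -> np.ndarray:
    """Return the unit-norm amplitude vector for a non-empty DNA sequence.

    The vector is zero-padded to the next power of two, so a sequence of
    length n needs about log2(n) qubits.
    """
    values = np.array([BASE_VALUES[b] for b in sequence.upper()], dtype=float)
    n_qubits = max(1, math.ceil(math.log2(len(values))))
    padded = np.zeros(2 ** n_qubits)
    padded[: len(values)] = values
    return padded / np.linalg.norm(padded)

def fidelity(a: np.ndarray, b: np.ndarray) -> float:
    """Squared overlap of two real-valued state vectors."""
    return float(np.dot(a, b) ** 2)

reference = amplitude_encode("ACGTACGT")
mutated = amplitude_encode("ACGTACGA")  # single-base substitution at the last position

n_qubits = int(math.log2(len(reference)))  # eight amplitudes fit in three qubits
overlap = fidelity(reference, mutated)
```

Because the vector is normalised globally, changing a single base rescales every amplitude rather than only the one at the mutated position, which is one plausible source of the input sensitivity noted above.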

Our idea is to develop an efficient and automated classical-to-quantum data encoding software-as-a-service (SaaS) toolkit, codenamed NZ-SeQTech, that is optimised for hybrid genomics data and federated quantum learning use cases. The aim is for NZ-SeQTech to provide researchers, students, professionals, and enthusiasts working in genomics with an easy-to-use and accessible data encoding tool for training hybrid classical-quantum models in a federated setup.

Lead Participant: HOMOMORPHIC LABS LTD
Project Cost: £49,966
Grant Offer: £49,966

Participant: UNIVERSITY OF LIVERPOOL
