Generating realistic Multiomic data
Lead Research Organisation: University of Cambridge
Department Name: Computer Science and Technology
Abstract
The problem of reverse engineering gene regulatory networks from high-throughput expression data is one of the biggest
challenges in bioinformatics. In order to benchmark network inference algorithms, simulators of well-characterized
expression datasets are often required. However, existing simulators have been criticized because they fail to emulate
key properties of gene expression data (Maier et al., 2013). In my research project I aim to address two problems. First, I
wish to study and propose mechanisms to faithfully assess the realism of a synthetic expression dataset. Second, I wish
to design an adversarial simulator of expression data based on a generative adversarial network (GAN; Goodfellow et
al., 2014). A GAN estimates a generative model through a two-player game: the first player (the generator) learns to
produce samples that resemble the data, and the second (the discriminator) tries to tell them apart from
samples drawn from the true data distribution. This novel deep learning framework has shown promising results for
tasks such as image or audio generation, and to the best of my knowledge GANs have not yet been applied to build a
simulator of gene expression data.
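
The two-player game described above can be illustrated with a toy sketch. This is not the simulator proposed in the project: it assumes a one-dimensional "expression" value drawn from a Gaussian as a stand-in for real data, a linear generator G(z) = a*z + b, and a logistic discriminator D(x) = sigmoid(w*x + c). All names and parameter values here are hypothetical and chosen for illustration only. The discriminator ascends the value function V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], while the generator uses the non-saturating objective (maximise log D(G(z))) suggested as a practical heuristic by Goodfellow et al. (2014).

```python
import math
import random


def sigmoid(t):
    """Numerically safe logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def gan_value(d_real, d_fake, eps=1e-12):
    """Sample estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs on real samples.
    d_fake: discriminator outputs on generated samples.
    """
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def train_toy_gan(steps=2000, batch=32, lr=0.01, seed=0):
    """Alternating gradient updates for a 1-D toy GAN (illustrative only)."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters: G(z) = a*z + b
    w, c = 0.1, 0.0  # discriminator parameters: D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        # Assumption: "real" data are N(4, 1), standing in for one gene's expression.
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: gradient ascent on V with respect to (w, c).
        gw = gc = 0.0
        for x in real:
            g = 1.0 - sigmoid(w * x + c)  # d/dlogit of log D(x)
            gw += g * x
            gc += g
        for x in fake:
            g = -sigmoid(w * x + c)  # d/dlogit of log(1 - D(x))
            gw += g * x
            gc += g
        w += lr * gw / batch
        c += lr * gc / batch

        # Generator step: ascend log D(G(z)) with respect to (a, b).
        ga = gb = 0.0
        for z in noise:
            g = (1.0 - sigmoid(w * (a * z + b) + c)) * w
            ga += g * z
            gb += g
        a += lr * ga / batch
        b += lr * gb / batch
    return a, b, w, c


if __name__ == "__main__":
    a, b, w, c = train_toy_gan()
    print("generator: a=%.3f b=%.3f | discriminator: w=%.3f c=%.3f" % (a, b, w, c))
```

When the discriminator cannot separate real from generated samples, it outputs 0.5 everywhere and the value function equals -log 4. Adversarial training of this kind can oscillate rather than settle, and a linear discriminator such as this one mainly compares means, so the sketch should be read as an illustration of the update rules rather than as a realistic simulator.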
Organisations
People | ORCID iD |
---|---|
Pietro Lio (Primary Supervisor) | |
Ramon Viñas (Student) | |
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name |
---|---|---|---|---|---|
EP/R513180/1 | | | 30/09/2018 | 29/09/2023 | |
2276380 | Studentship | EP/R513180/1 | 30/09/2019 | 29/09/2022 | Ramon Viñas |