An intelligent approach to the automatic characterisation and design of synthetic promoters in mammalian cells

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Engineering


Synthetic Biology (SynBio) is an emerging engineering discipline with an ambitious goal: empowering scientists with the ability to programme new functions into cells, just like we would do with computers. Despite a thriving community and notable successes, however, writing "functioning algorithms" for cells remains extremely time-consuming. This is a roadblock to the engineering of mammalian cells, an area uniquely positioned to develop potentially groundbreaking therapeutic applications. The resulting high development costs, in turn, limit the pace at which Synthetic Biology progresses towards applications. Model-Based System Engineering (MBSE) is the answer the engineering community found to similar problems and is widely used to streamline manufacturing. In this framework, mathematical models are used to screen candidate designs via simulations, so that only the most promising solutions are brought to testing.

Despite being an engineering discipline, SynBio lacks an MBSE framework. This is largely due to three connected issues: (a) accurate mathematical models of parts (e.g. promoters) are scarce in the first place; (b) this shortage makes it difficult to "reverse engineer" the connection between the DNA sequence and the kinetics of the transcribed mRNA (e.g. promoter sequence and leakiness of expression); and (c) as a result, the inverse "re-design" problem, i.e. finding the optimal DNA sequence of a part, cannot be solved, let alone automatically.

With this fellowship, I aim to fill this gap by developing a "Model-Based Biosystem Engineering" (MBBE) framework to automate the Design-Build-Test-Learn (DBTL) cycle in Synthetic Biology. Given the role of synthetic promoters in cell and gene therapy, my team and I will focus on synthetic promoters for mammalian cells. Prompted by the recent successes and challenges of CAR T cells (immune cells engineered to kill cancer cells), we will use the framework to engineer a hypoxia-inducible promoter that optimises a set of criteria we will determine and prioritise with our collaborator Prof. Chen at UCLA.

We will first focus on developing the MBBE framework. To this end, we will tackle the three issues mentioned above by: (a) developing a high-throughput microfluidic device that allows us to infer, with minimal experimental effort (via Optimal Experimental Design), reliable mathematical models of hundreds of variants of a promoter; (b) using these results to automatically learn/predict gene expression dynamics from promoter sequence via machine learning; and (c) combining this prediction scheme with computational optimisation to identify and refine promoter sequences so that they satisfy given specifications and maximise pre-determined objectives.
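
Steps (b) and (c) above can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration, not the method of the proposal: the "promoters" are 3-base toy sequences, the predictor is a simple additive per-position model fitted from marginal means (a stand-in for the machine-learning step), the expression values come from the hypothetical `TOY_EFFECTS` table rather than from experiments, and all function names are invented.

```python
from collections import defaultdict
from itertools import product

BASES = "ACGT"

# Hypothetical per-position base effects used only to generate toy "measurements".
TOY_EFFECTS = [
    {"A": 1.0, "C": 0.2, "G": -0.5, "T": 0.0},
    {"A": 0.0, "C": 0.3, "G": 0.9, "T": -0.2},
    {"A": -0.4, "C": 0.1, "G": 0.0, "T": 0.7},
]


def toy_expression(seq):
    """Simulated expression level of a toy sequence (stands in for an experiment)."""
    return sum(TOY_EFFECTS[pos][base] for pos, base in enumerate(seq))


def fit_additive_model(sequences, expression):
    """Step (b): estimate each (position, base) effect as marginal mean minus global mean."""
    global_mean = sum(expression) / len(expression)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for seq, y in zip(sequences, expression):
        for pos, base in enumerate(seq):
            sums[(pos, base)] += y
            counts[(pos, base)] += 1
    effects = {key: sums[key] / counts[key] - global_mean for key in sums}
    return global_mean, effects


def predict(model, seq):
    """Predicted expression of a sequence under the fitted additive model."""
    global_mean, effects = model
    return global_mean + sum(effects.get((pos, base), 0.0) for pos, base in enumerate(seq))


def design_best_sequence(model, length):
    """Step (c): for an additive model, the optimum is the best base at each position."""
    _, effects = model
    return "".join(
        max(BASES, key=lambda base: effects.get((pos, base), float("-inf")))
        for pos in range(length)
    )


# Toy library: every 3-base sequence, "measured" with the simulated experiment.
library = ["".join(bases) for bases in product(BASES, repeat=3)]
measurements = [toy_expression(seq) for seq in library]

model = fit_additive_model(library, measurements)
best_sequence = design_best_sequence(model, length=3)
print(best_sequence, predict(model, best_sequence))
```

Because the toy library is a full factorial and the simulated data are exactly additive, the fitted model reproduces the measurements and the per-position search finds the true optimum. Real promoter data would call for a richer predictor and a proper optimiser, which is why this is only a sketch of the workflow.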

To develop a hypoxia-inducible promoter, we will start from an initial pool of 600 sequences (designed to cover as large a fraction of the design space as possible), and we will iterate twice over our automatic DBTL loop to finally obtain promoter(s) that can be used to overcome the current limitations of CAR T cells.

Besides automating the DBTL cycle, the approach I propose has three main benefits. It allows us to obtain, and publicly share, reliable models (1) faster, as we will use Optimal Experimental Design methods to minimise experimental effort; (2) cost-effectively, as microfluidics drastically reduces the use of reagents and automation renders human intervention unnecessary; and (3) reproducibly, as all the data and the steps in the inference are tracked and immediately made publicly available.

Planned Impact

I anticipate that this project will encompass a wide range of benefits and beneficiaries. I have structured expected impacts and beneficiaries in time frames: short, medium and long-term.

In the short term (1-5 years) academic research communities will be the primary beneficiary. Mammalian SynBio will benefit from the large set of thoroughly characterised promoters we will produce. The Control Engineering and Bio-Design Automation communities will benefit from the availability of "cheap" yet robust models that can be used, for example, to study how the dynamics of subsystems depend on cellular metabolism (i.e. metabolic burden). Large datasets of sequence/models will provide the Biophysics and Systems Biology communities with novel insights into both network dynamics and how the architecture of promoters determines the kinetics of transcription in mammalian cells. Annotated images and time-series will help researchers in Computer Vision/Machine Learning to develop/test new algorithms to study biological processes. Our industrial partners, Labcyte and Sphere Fluidics, as well as other early industrial adopters will be the secondary beneficiaries. The discussions I had with our partners make me believe that there is an opportunity to develop our results (with particular emphasis on the microfluidic device) into products and license them (e.g. to Sphere Fluidics). To maximise the impact of this project, within the first 5 years, I will spin out a company, "SynQuanta", that will specialise in developing technologies for "quantitative phenotyping" in synthetic biology.

In the medium term (5 to 10 years), this project will impact a wide range of industrial entities that are important for the British bioeconomy. The Edinburgh Genome Foundry is interested in integrating our platform in their assembly pipeline to provide a new "end-to-end" service to its customers. In addition to the impact of the platform, I envision that the hypoxia-inducible promoter we will build has the potential to widen the application of CAR-T cell therapy to non-solid cancers. I expect to be a beneficiary of this research myself, as it will give me the chance to establish myself as a leader in the field at the boundary of Control Engineering and SynBio. I will seek to maximise these impacts by disseminating my work, liaising with existing and new industrial partners with the support of IBioIC, and further integrating my research in training activities.

In the long term (10-50 years), industry and academia will equally benefit from this research. The most important impact foreseen for this project is the full automation of the Design-Build-Test-Learn cycle in mammalian Synthetic Biology. This will significantly speed up circuit construction and relieve a fundamental bottleneck of SynBio, unleashing its potential in both academia and industry. I anticipate that such developments will make it easier to engineer circuits with predictable behaviour; these are the key enablers of the next generation of smart therapies for precision medicine. The Social Studies of Science (SSS) community will have the opportunity to study how key aspects of scientists' lives (e.g. formation and communication of knowledge) are affected by automation. As we aim to automate "creative" processes (e.g. design/modelling), I am also excited to learn what social scientists will conclude after studying how platforms like ours will change the job market and the education of professionals. To facilitate impacts, we will keep seeking opportunities to engage with both academia and industry under the guidance of EI.
Title Crossing Kingdoms and Disciplines: Living Art 
Description In collaboration with Oron Catts and Tarsh Bates we developed an installation at the Edinburgh Science Festival that explores cross-kingdom cell fusion between yeast and mammalian cells. Specifically, we redesigned our microfluidic device to accommodate the imaging setup needed to capture the micrographs displayed as part of the installation itself. 
Type Of Art Artistic/Creative Exhibition 
Year Produced 2018 
Impact This installation sparked the interest of many commentators in the biotech space and triggered a lively discussion among artists and social scientists on the meaning of "belonging", as well as on the legitimacy of artists to explore it. The work received much attention from the media with interviews documented here: 
Description We have discovered a computational procedure that allows us to obtain the optimal design of characterisation experiments for genetic constructs.
Exploitation Route We have made the code associated with this procedure available through our Group GitHub account; this resource is now available for anyone to use.
Sectors Healthcare, Manufacturing, including Industrial Biotechnology

Title Mask Effectiveness As Part Of COVID-19 Response - Droplet Ejection and Deposition Tests. Dataset 1 
Description "Face coverings and respiratory tract droplet dispersion" Supplementary Information.
Background: Respiratory droplets are the primary transmission route for SARS-CoV-2, a principle which drives social distancing guidelines. Evidence suggests that virus transmission can be reduced by face coverings, but robust evidence for how mask usage might affect safe distancing parameters is lacking. Accordingly, we set out to quantify the effects of face coverings on respiratory tract droplet deposition.
Methods: We tested an anatomically-realistic manikin head which ejected fluorescent droplets of water, and human volunteers, in speaking and coughing conditions without a face covering, with a surgical mask and/or a single layer cotton face covering. We quantified the number of droplets in flight using laser sheet illumination, and used UV light to quantify those that had landed at table height, from 0.25 m up to 2 m. For human volunteers, expiratory droplets were caught on a microscope slide 5 cm from the mouth.
Findings: Whether manikin or human, wearing a face covering decreased the number of projected droplets by >1000-fold. The effect was so marked that wearing a face mask rendered droplets virtually undetectable at any tested distance. We also estimated that a person standing 2 m from someone coughing without a mask is exposed to over 10,000 times more respiratory droplets than someone standing 0.5 m from someone wearing a basic single layer mask.
Interpretation: Our results indicate that face coverings show consistent efficacy at blocking respiratory droplets and thus provide an opportunity to moderate social distancing policies. However, the methodologies we employed mostly detect larger (non-aerosol) sized droplets. Whilst SARS-CoV-2 is spread by respiratory droplets and the fomites they generate, the relative importance between these modes of transmission and true aerosol transmission is uncertain. If aerosol transmission is later determined to be a significant driver of infection, then our findings may overestimate the effectiveness of face coverings.
Structure: The information is grouped into four zipped datasets: Dataset 1: UV light test - droplets deposition images; Dataset 2: Microscopy tests - droplets deposition images; Dataset 3: Laser tests - droplets path images; Dataset 4: Shadow Imaging. For further details, please download the ReadMe.txt file from each dataset. All the datasets are part of the Collection "Face coverings and respiratory tract droplet dispersion".
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Title Optimal Experimental Design pipeline for promoter characterisation 
Description We published the software we use for the in-silico characterisation of genetic promoters. This software automatically identifies the best perturbation to be applied to cells in order to maximise the information extracted from the in vivo characterisation experiments we carry out in microfluidics. 
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact At the latest Conference on Decision and Control, the largest conference in the Control community, we were approached by a number of research groups working on biosystem modelling who told us that our algorithms had helped them save significant amounts of resources on the characterisation experiments they routinely carry out.
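
The idea of choosing the perturbation that maximises the information extracted from an experiment, as described for the software above, can be sketched on a toy problem. This is not the published pipeline: the saturating response model `y = a*u/(k+u)`, the nominal parameter values, the candidate input levels, the unit-variance noise assumption, the D-optimality criterion (maximising the determinant of the Fisher information matrix) and all function names are assumptions chosen for illustration.

```python
from itertools import combinations


def sensitivities(u, a, k):
    """Partial derivatives of y = a*u/(k+u) with respect to the parameters (a, k)."""
    return (u / (k + u), -a * u / (k + u) ** 2)


def fisher_determinant(inputs, a, k):
    """Determinant of the 2x2 Fisher information matrix, assuming unit-variance noise."""
    f_aa = f_ak = f_kk = 0.0
    for u in inputs:
        g_a, g_k = sensitivities(u, a, k)
        f_aa += g_a * g_a
        f_ak += g_a * g_k
        f_kk += g_k * g_k
    return f_aa * f_kk - f_ak * f_ak


def d_optimal_design(candidates, n_points, a, k):
    """Exhaustively pick the set of input levels with the largest Fisher determinant."""
    return max(
        combinations(candidates, n_points),
        key=lambda design: fisher_determinant(design, a, k),
    )


# Hypothetical inducer levels (arbitrary units) and nominal parameter guesses.
CANDIDATE_INPUTS = [0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 8.0]
best_design = d_optimal_design(CANDIDATE_INPUTS, n_points=2, a=1.0, k=1.0)
print(best_design)
```

In this toy setting the search pairs the highest available input with an intermediate one: the first pins down the saturation level `a`, the second the half-saturation constant `k`. A single input level gives a singular information matrix, so both parameters cannot be identified from it. The design depends on the nominal parameter guesses, which is why such procedures are usually iterated as estimates improve.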