Towards a Universal Biological-Cell Operating System (AUdACiOuS)

Lead Research Organisation: University of Nottingham
Department Name: School of Computer Science

Abstract

A living cell, e.g. a bacterium, is an information-processing machine. It is composed of a series of sub-systems that work in concert by sensing external stimuli, assessing the cell's own internal states and making decisions through complex, interlinked biological regulatory network (BRN) motifs that act as the bacterium's neural network. A bacterium's decision-making processes often result in a variety of outputs, e.g. the creation of more cells, chemotaxis, biofilm formation, etc. It was recently shown that cells not only react to their environment but can even predict environmental changes. The emerging discipline of Synthetic Biology (SB) considers the cell to be a machine that can be built from parts in a manner similar to, e.g., electronic circuits, airplanes, etc. SB has sought to co-opt cells for nano-computation and nano-manufacturing purposes. During this leadership fellowship programme of research I will aim to make E. coli bacteria much easier to program and hence to harness for useful purposes. In order to achieve this, I plan to take the tools, methodologies and resources that computer science has created for writing computer programs and find ways of making them useful in the microbiology laboratory.

Planned Impact

This proposal will have impact in four main areas.

General Dissemination Pathway: As with any interdisciplinary research project, we will aim to publish in the very top specialised (Computer Science, Synthetic Biology) and generalist (Nature, Science, PNAS) journals. Our work will also be disseminated through the respective conferences, and we have planned a series of workshops that will help us showcase our work and make it available to other academics and industrialists.

Wider Impact, Exploitation & Knowledge Transfer Pathway: The technology I will develop during my fellowship is an *enabling* technology. As such it will open the doors for cooperation across a range of disciplines. Notably, we propose to transfer knowledge back and forth across three main areas of research, namely computer science, synthetic biology and, for the first time, DNA and RNA origami. This could very well seed new research lines and have impact on fields as varied as biomedical applications, the in vivo construction of detection sensors, and smart drug delivery systems.

Intellectual Property Protection and Technology Transfer: We will actively seek routes for commercial exploitation of our research, and intellectual property will be protected so as to guarantee a return to UK PLC.


Society At Large Related Pathway: The research ideas in this proposal are groundbreaking and could herald a revolution in the way synthetic biology projects are undertaken, lowering the barriers of usability and complexity and, ultimately, impacting on practical applications. As such, this technology may also raise important ethical, legal and social issues (ELSI) that must be explored in conjunction with relevant stakeholders (public, government, industry, academia, NGOs, regulators, etc.). I have established a rigorous pathway to impact that embraces ELSI as well as science, technology and societal considerations.

Publications

Description Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes that are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models. These models are able to capture complex relationships between variables that are often different from, and complementary to, those found by similarity-based methods.

We propose a protocol to infer functional networks from machine learning models, called FuNeL. It assumes that genes used together within a rule-based machine learning model to classify the samples might also be functionally related at a biological level. The protocol is first tested on synthetic datasets and then evaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from the real-world data are compared against gene co-expression networks of equal size, generated with 3 different methods. The comparison is performed from two different points of view: we analyse the enriched biological terms in the set of network nodes, and the relationships between known disease-associated genes in the context of the network topology. The comparison confirms both the biological relevance and the complementary character of the knowledge captured by the FuNeL networks in relation to similarity-based methods, and demonstrates their potential to identify known disease associations as core elements of the network. Finally, using a prostate cancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant to the disease and consistent with the specialised literature and with an independent dataset not used in the inference process.
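
The co-prediction idea described above can be illustrated with a minimal sketch. It is not the FuNeL implementation: the function name `co_prediction_network`, the `min_support` threshold and the toy gene labels are all hypothetical, and the rules are assumed to be already available as plain sets of gene names rather than produced by a real rule-based classifier.

```python
from collections import Counter
from itertools import combinations


def co_prediction_network(rules, min_support=1):
    """Count how often two genes appear together in the same rule.

    `rules` is an iterable of gene-name collections, one per rule
    (an assumed, simplified stand-in for a rule-based model).
    Returns a dict mapping an alphabetically sorted (gene_a, gene_b)
    pair to the number of rules that use both genes.
    """
    edge_counts = Counter()
    for rule in rules:
        genes = sorted(set(rule))  # de-duplicate and fix the pair order
        for a, b in combinations(genes, 2):
            edge_counts[(a, b)] += 1
    # Keep only edges supported by at least `min_support` rules.
    return {edge: n for edge, n in edge_counts.items() if n >= min_support}


# Toy rules with hypothetical gene labels.
rules = [
    {"geneA", "geneB"},
    {"geneA", "geneB", "geneC"},
    {"geneC", "geneD"},
]
network = co_prediction_network(rules)
```

Here the pair `("geneA", "geneB")` gets weight 2 because both genes are used together in two rules; raising `min_support` prunes weaker edges.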


We present an implementation of an in vitro signal recorder based on DNA assembly and strand displacement. The signal recorder implements a stack data structure in which both data as well as operators are represented by single stranded DNA "bricks". The stack grows by adding push and write bricks and shrinks in last-in-first-out manner by adding pop and read bricks. We report the design of the signal recorder and its mode of operations and give experimental results from capillary electrophoresis as well as transmission electron microscopy that demonstrate the capability of the device to store and later release several successive signals. We conclude by discussing potential future improvements of our current results.
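
As a purely software analogy of the last-in-first-out behaviour described above, the following sketch models the recorder as a list of bricks. It does not model strand displacement or any chemistry; the class name `SignalRecorder` and its methods are hypothetical, and it assumes that each recorded signal occupies one push brick plus one write brick, which a read/pop pair later removes.

```python
class SignalRecorder:
    """Toy LIFO model of a brick-based signal recorder (illustrative only)."""

    def __init__(self):
        self._bricks = []  # bottom of the stack is at index 0

    def record(self, signal):
        # Assumed model: a push brick extends the stack,
        # then a write brick stores the signal on top of it.
        self._bricks.append(("push", None))
        self._bricks.append(("write", signal))

    def release(self):
        # Assumed model: a read brick frees the topmost write brick,
        # then a pop brick removes the push brick beneath it.
        if not self._bricks:
            raise IndexError("recorder is empty")
        _, signal = self._bricks.pop()
        self._bricks.pop()
        return signal

    def __len__(self):
        # Number of signals currently stored.
        return len(self._bricks) // 2


recorder = SignalRecorder()
for s in ["X", "Y", "Z"]:
    recorder.record(s)
released = [recorder.release() for _ in range(3)]
```

Signals come back in the reverse of the order in which they were recorded, which is the stack discipline the abstract describes.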


Unconventional computing is an area of research in which novel materials and paradigms are utilised to implement computation. Previously we have demonstrated how registers, logic gates and logic circuits can be implemented, unconventionally, with a biocompatible molecular switch, NitroBIPS, embedded in a polymer matrix. NitroBIPS and related molecules have been shown elsewhere to be capable of modifying many biological processes in a manner that is dependent on their molecular form. Thus, one possible application of this type of unconventional computing is to embed computational processes into biological systems. Here we expand on our earlier proof-of-principle work and demonstrate that universal computation can be implemented using NitroBIPS. We have previously shown that spatially localised computational elements, including registers and logic gates, can be produced. We explain how parallel registers can be implemented, then demonstrate an application of parallel registers in the form of Turing machine tapes, and demonstrate both parallel registers and logic circuits in the form of elementary cellular automata. The Turing machines and elementary cellular automata utilise the same samples and same hardware to implement their registers, logic gates and logic circuits, and both represent examples of universal computing paradigms. This shows that homogeneous photochromic computational devices can be dynamically repurposed without invasive reconfiguration. The result represents an important, necessary step towards demonstrating the general feasibility of interfacial computation embedded in biological systems or other unconventional materials and environments.
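
For readers unfamiliar with elementary cellular automata, the sketch below shows one update step in ordinary software. It says nothing about how the NitroBIPS hardware realises registers or gates: the function name `eca_step`, the list-of-bits representation and the wrap-around boundary are assumptions made for illustration. Rule 110 is used in the example because it is a well-known elementary rule proven to be computationally universal.

```python
def eca_step(cells, rule):
    """Apply one step of an elementary cellular automaton.

    `cells` is a list of 0/1 values, treated as a ring (assumed boundary).
    `rule` is the Wolfram rule number (0-255): bit k of `rule` gives the
    next state for the neighbourhood whose (left, centre, right) bits
    encode the integer k.
    """
    n = len(cells)
    next_cells = []
    for i in range(n):
        left = cells[(i - 1) % n]
        centre = cells[i]
        right = cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right
        next_cells.append((rule >> index) & 1)
    return next_cells


state = [0, 0, 0, 1, 0]
after_rule_110 = eca_step(state, 110)  # [0, 0, 1, 1, 0]
```

Each cell reads a three-cell neighbourhood (a small parallel register) and applies a fixed truth table (a logic circuit), which is why the automaton exercises both kinds of element at once.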
Exploitation Route N/A
Sectors Manufacturing, including Industrial Biotechnology, Pharmaceuticals and Medical Biotechnology

 
Title Combinatorial DNA Library Design Planner Web Server 
Description The web server presented here provides solutions with a near-minimal number of stages and, thanks to almost instantaneous planning of DNA libraries, it can be used as a metric of "manufacturability" to guide DNA library design. Rapid planning remains applicable even for DNA library sizes vastly exceeding the capacity of today's biochemical assembly methods, future-proofing our method.
Type Of Technology Webtool/Application 
Year Produced 2014 
Impact -- 
URL http://www.dnald.org/planner/index.html
 
Title DNALD Planner Software 
Description De novo DNA synthesis is in need of new ideas for increasing production rate and reducing cost. DNA reuse in combinatorial library construction is one such idea. Here, we describe an algorithm for planning multistage assembly of DNA libraries with shared intermediates that greedily attempts to maximize DNA reuse, and show both theoretically and empirically that it runs in linear time. We compare solution quality and algorithmic performance to the best results reported for computing DNA assembly graphs, finding that our algorithm achieves solutions of equivalent quality but with dramatically shorter running times and substantially improved scalability. We also show that the related computational problem bounded-depth min-cost string production (BDMSP), which captures DNA library assembly operations with a simplified cost model, is NP-hard and APX-hard by reduction from vertex cover. The algorithm presented here provides solutions with a near-minimal number of stages and, thanks to almost instantaneous planning of DNA libraries, it can be used as a metric of "manufacturability" to guide DNA library design. Rapid planning remains applicable even for DNA library sizes vastly exceeding the capacity of today's biochemical assembly methods, future-proofing our method.
Type Of Technology Software 
Year Produced 2014 
Open Source License? Yes  
Impact -- 
URL http://www.dnald.org/planner/ACS_sb-2013-00161v_SI.zip
 
Title EnrichNET 
Description EnrichNet is a network-based enrichment analysis method to identify functional associations between user-defined gene or protein sets and cellular pathways. The datasets are mapped onto a protein interaction network (or other user-defined molecular network) and their pairwise associations are assessed by computing a graph-based statistic, i.e. distances between the network nodes are mapped against a background model. In contrast to the classical overlap-based enrichment analysis, associations can also be identified for non-overlapping gene/protein sets and the user can investigate them in detail by visualizing corresponding sub-graphs. 
Type Of Technology Webtool/Application 
Year Produced 2012 
Impact -- 
URL http://www.enrichnet.org/
 
Title FuNeL - Functional Network Learning protocol 
Description FuNeL is a protocol to infer functional networks from machine learning models. It is a general approach that uses BioHEL, a rule-based evolutionary classifier, to describe the expression samples as a set of rules and then infers interactions between genes that act together within rules to predict the sample class. FuNeL generates co-prediction networks that capture biological knowledge complementary to that contained in popular similarity-based co-expression networks.
Type Of Technology Software 
Year Produced 2012 
Open Source License? Yes  
Impact -- 
URL http://ico2s.org/software/funel.html
 
Title Java Enrichment of Pathways Extended To Topology 
Description JEPETTO is a Cytoscape 3.x plugin which uses our servers (EnrichNet, PathExpand and TopoGSA) to analyse a user-submitted human gene set. It identifies associations between genes and pathways using protein interaction networks and topological analysis. JEPETTO performs two types of analysis: it can enrich a gene set with components from the pathways and processes closely connected in the interaction network, or compare topological properties of the gene set interactions to properties of known cellular pathways and processes. The plugin allows users to infer information about biological regulatory mechanisms and identify putative new co-factors that would not have been detected using the standard term-based analysis.
Type Of Technology Software 
Year Produced 2014 
Open Source License? Yes  
Impact -- 
URL http://ico2s.org/addons/jepetto.html
 
Title Server for the prediction of structural aspects of protein residues 
Description The PSP server contains a collection of web services that address several protein structure prediction (PSP) sub-problems. Each of these sub-problems focuses on a single structural feature of a protein, and the PSP server uses a Learning Classifier System to predict them for a given sequence of amino acids. The structural features predicted by the PSP server are: the density of packing of different parts of a protein; how buried or exposed, and how far from or close to the surface, different residues within a protein are; classic measures such as coordination number or solvent accessibility; and metrics derived from modelling the protein structure using topological and geometrical properties.
Type Of Technology Webtool/Application 
Year Produced 2012 
Impact -- 
URL http://cruncher.ncl.ac.uk/psp/prediction/action/home
 
Title The Infobiotics Workbench 
Description The Infobiotics Workbench is an executable biology framework implementing multi-compartmental stochastic and deterministic simulation, formal model analysis and structural/parameter model optimisation for computational systems and synthetic biology. The Infobiotics Workbench comprises the following components: (1) a modelling language based on P systems, which allows modular and parsimonious multi-cellular model development where the outermost compartments can be positioned in 2-dimensional space to facilitate modelling at extra-, inter- or intracellular levels of detail; (2) a deterministic and stochastic simulator using algorithms optimised for large multi-compartmental systems (the simulator also accepts a subset of SBML, allowing for visual model specification using tools such as CellDesigner); (3) formal model analysis for the study of temporal and spatial model properties, supported by the model checkers PRISM and MC2; (4) model structure and parameter optimisation using a variety of evolutionary and population-based algorithms to automatically generate models whose dynamics match specified target timeseries; (5) a user-friendly front-end for performing in-silico experiments, plotting and visualisation of simulations with many runs and compartments.
Type Of Technology Software 
Year Produced 2012 
Open Source License? Yes  
Impact -- 
URL http://ico2s.org/software/infobiotics.html