Genestorian: a web application to document and trace genetic modifications in model organism and cell line collections.

Lead Research Organisation: UNIVERSITY COLLEGE LONDON
Department Name: Genetics Evolution and Environment

Abstract

Currently, no open standards exist to unambiguously describe cloning strategies, genotypes and allele inheritance. Consequently, laboratories often store their plasmids, cell lines and strains in spreadsheets or text-based systems, which are necessarily inconsistent and differ between collections. Therefore, for curators or even members of a laboratory, it can be time-consuming or impossible to know the sequence and provenance of a plasmid or allele. Since biological knowledge bases (UniProt, model organism databases, etc.) rely on links between gene variants and phenotypes to annotate functions to gene products, the current situation limits the reusability of biological resources and the broad impact of research.

I propose to develop Genestorian, a web application to manage collections of oligonucleotides, plasmids, strains and cell lines where sequences can be traced through cloning steps up to their entry into the collection. Researchers will plan the generation of new resources from existing ones, with new sequences generated by in silico cloning. Consequently, data standardisation will occur at the planning stage, and will not be a burden at the submission stage. It will be easy to query the collection, access the sequence, ancestry and progeny of resources, and export this information for the methods section of a paper or in a standard format. Standardisation will enable information exchange between laboratory collections, journals, knowledge bases and resource repositories. Therefore, Genestorian aligns with the European Union's commitment to Open Science and will promote resource reusability and maximise the impact of genetic research, facilitating its reproducibility and interpretation by tracing results to specific DNA sequences.
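
To make the idea of traceable ancestry and progeny concrete, here is a minimal sketch of how such a collection could be modelled. It is an illustration only, not Genestorian's actual data model: the `Resource` and `Collection` classes, their fields, and the example names and sequences are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A hypothetical collection entry (oligo, plasmid, strain or cell line)."""
    name: str
    sequence: str
    # Names of the resources this one was derived from (empty for imported entries).
    parents: list[str] = field(default_factory=list)
    # Free-text label of the step that produced this resource.
    method: str = "imported"


class Collection:
    """Hypothetical in-memory collection that records lineage between resources."""

    def __init__(self) -> None:
        self._items: dict[str, Resource] = {}

    def add(self, resource: Resource) -> None:
        # Every parent must already be registered, so lineage can never dangle.
        for parent in resource.parents:
            if parent not in self._items:
                raise KeyError(f"unknown parent: {parent}")
        self._items[resource.name] = resource

    def ancestry(self, name: str) -> list[str]:
        """Return all ancestors of `name`, nearest first, without duplicates."""
        seen: list[str] = []
        queue = list(self._items[name].parents)
        while queue:
            current = queue.pop(0)
            if current not in seen:
                seen.append(current)
                queue.extend(self._items[current].parents)
        return seen

    def progeny(self, name: str) -> list[str]:
        """Return every resource that has `name` somewhere in its ancestry."""
        return [n for n in self._items if name in self.ancestry(n)]


# Made-up entries: a plasmid built from two parts, then used to make a strain.
collection = Collection()
collection.add(Resource("backbone", "ACGTACGT"))
collection.add(Resource("insert", "TTGACA"))
collection.add(Resource("pExample1", "ACGTTTGACAACGT", ["backbone", "insert"], "ligation"))
collection.add(Resource("strainA", "ACGTTTGACAACGT", ["pExample1"], "transformation"))

print(collection.ancestry("strainA"))   # ['pExample1', 'backbone', 'insert']
print(collection.progeny("backbone"))   # ['pExample1', 'strainA']
```

Because parents are recorded when a resource is planned, lineage queries need no extra curation at submission time, which is the point the abstract makes about standardising at the planning stage.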

Publications

Title A database of annotated plasmids in the iGEM 2024 distribution 
Description The 2024 iGEM plasmid distribution provides teams with essential genetic parts for synthetic biology projects. However, these plasmids are distributed as raw DNA sequences without detailed sequence annotations, which identify functional elements such as genes, regulatory regions, and cloning features. To address this, we created a repository that annotates these plasmids using pLannotate, a tool for automated sequence annotation. By making these annotated plasmids available, we help researchers and iGEM teams quickly interpret plasmid functions, design experiments more efficiently, and reduce errors in genetic engineering workflows. 
Type Of Material Database/Collection of data 
Year Produced 2024 
Provided To Others? Yes  
Impact This resource enhances the accessibility and usability of the 2024 iGEM distribution for the synthetic biology community. In addition, the annotated plasmids can be accessed directly from the web application funded by this grant. 
URL https://github.com/manulera/annotated-igem-distribution
 
Title A database of plasmids containing gateway cloning sites 
Description Data mining software project in which AddGene plasmids containing Gateway cloning sites were downloaded and categorised to produce a searchable database. The data were then used to derive consensus sites for each type of Gateway site. 
Type Of Material Database/Collection of data 
Year Produced 2024 
Provided To Others? Yes  
Impact Consensus sites produced from this data are used to simulate Gateway cloning in the web application funded by this grant. In addition, the site offers a portal to explore the dataset and find plasmids based on the features present in them. 
URL https://github.com/manulera/GateWayMine
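
The consensus step mentioned in the GateWayMine description can be illustrated with a simple column-wise majority vote over pre-aligned sequences. This is a sketch under stated assumptions, not the method the project uses (the record does not describe it): the `consensus` function and its `min_fraction` parameter are hypothetical, the input sequences are made up, and real consensus sites would typically use degenerate IUPAC codes rather than a plain `N`.

```python
from collections import Counter


def consensus(aligned_sites: list[str], min_fraction: float = 0.5) -> str:
    """Column-wise majority consensus of equal-length sequences.

    A position is written as 'N' unless one base accounts for strictly
    more than `min_fraction` of the sequences at that position.
    """
    if not aligned_sites:
        raise ValueError("need at least one sequence")
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("sequences must be pre-aligned to equal length")

    result = []
    for column in zip(*aligned_sites):
        # Most frequent base in this column and how often it occurs.
        base, count = Counter(column).most_common(1)[0]
        result.append(base if count / len(column) > min_fraction else "N")
    return "".join(result)


# Made-up toy sequences, not real Gateway att sites.
toy_sites = ["ACAAGT", "ACAAGT", "ACTAGT", "GCAAGA"]
print(consensus(toy_sites))  # ACAAGT
```

Positions where no single base dominates are masked, so a simulated recombination that matches against the consensus would tolerate variation at those positions only.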
 
Title OpenCloning, a web application to plan and document cloning strategies 
Description OpenCloning is an open-source web application to plan and document cloning. Users can:
1. Import plasmid sequences from AddGene and gene sequences from NCBI.
2. Load their own sequence files.
3. Plan cloning and design primers using common techniques (Gibson, Golden Gate, Gateway, etc.).
4. Plan strain and cell line engineering via CRISPR and homologous recombination, including use cases not supported by SnapGene or Benchling.
5. Automate repetitive cloning and primer design using scripts or web forms.
6. Download final constructs as GenBank or FASTA files.
7. Archive the entire cloning history in an open format and load it later.
8. Create reusable cloning templates for cloning kits.
Type Of Technology Webtool/Application 
Year Produced 2024 
Open Source License? Yes  
Impact This web application is the main output planned from this funding. It is already available for researchers to plan and document their experiments. It currently supports most cloning methods supported by proprietary alternatives, even including some methods not supported by proprietary tools. It also allows users to export the plan of their experiment in an open format, which is not supported by proprietary software. 
URL https://github.com/manulera/OpenCloning
 
Title pLannotate Web API and Docker Integration 
Description Sequence annotation is a critical step in synthetic biology, helping researchers identify functional elements within DNA sequences. pLannotate is a powerful tool for automated sequence annotation, but using it currently requires local installation and command-line expertise, and it is not easy to integrate into a pipeline or use in a production-level web application. To make pLannotate more accessible, I developed a web API that allows other applications to integrate its functionality seamlessly. Additionally, we created a containerized Docker version, ensuring easy deployment and reproducibility across different computing environments. This work lowers the barrier for researchers and developers, enabling broader adoption of automated sequence annotation in synthetic biology workflows. 
Type Of Technology Software 
Year Produced 2024 
Impact This enabled the web application funded by this grant to integrate with this existing software package. 
URL https://github.com/manulera/pLannotate-api-docker
 
Description Lead role in organisation of Synthetic Biology afterwork events 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Along with 4 other colleagues from UCL and Imperial College, I started a monthly seminar series, the "London SynBio Network".

The format is an afterwork event consisting of two 20-minute talks followed by networking with drinks and snacks. The main audience is Early Career Researchers and industry members interested in Synthetic Biology. So far, we have organised 6 events with an average registration of 80 people.

These events have helped me meet prospective users of the web application that this grant funds.
Year(s) Of Engagement Activity 2024,2025
URL https://events.humanitix.com/copy-of-london-synbio-network-6
 
Description Lead role in organisation of Python library hackathon and monthly meetings 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact As part of my work on the web application supported by this grant, I take part in the maintenance of the Python library pydna. Pydna is a Python package that provides human-readable formal descriptions of cloning and genetic assembly strategies for simulation and verification. Pydna can be used as executable documentation for cloning.

I have taken a leading role in activating the community of users by:

- Organising monthly meetings with pro-users and developers.
- Organising a one-day pydna "hackathon".

The typical attendance of the monthly meetings is 6 people, and 12 people participated in the hackathon.

These activities have created a small community of maintainers and users of the library who know each other, and have led to an overall improvement of the library, including bug fixes, better documentation and better software development practices.
Year(s) Of Engagement Activity 2024
URL https://github.com/pydna-group/pydna