Genestorian: a web application to document and trace genetic modifications in model organism and cell line collections.
Lead Research Organisation:
UNIVERSITY COLLEGE LONDON
Department Name: Genetics Evolution and Environment
Abstract
Currently, no open standards exist to unambiguously describe cloning strategies, genotypes and allele inheritance. Consequently, laboratories often store their plasmids, cell lines and strains in spreadsheets or text-based systems, which are necessarily inconsistent and differ between collections. Therefore, for curators or even members of a laboratory, it can be time-consuming or impossible to know the sequence and provenance of a plasmid or allele. Since biological knowledge bases (UniProt, model organism databases, etc.) rely on links between gene variants and phenotypes to annotate functions to gene products, the current situation limits the reusability of biological resources and the broad impact of research.
I propose to develop Genestorian, a web application to manage collections of oligonucleotides, plasmids, strains and cell lines where sequences can be traced through cloning steps up to their entry into the collection. Researchers will plan the generation of new resources from existing ones, with new sequences generated by in silico cloning. Consequently, data standardisation will occur at the planning stage, and will not be a burden at submission stages. It will be easy to query the collection, access the sequence, ancestry and progeny of resources, and export this information for the methods section of a paper or in a standard format. Standardisation will enable information exchange between laboratory collections, journals, knowledge bases and resource repositories. Therefore, Genestorian aligns with the European Union commitment to Open Science and will promote resource reusability and maximise the impact of genetic research, facilitating its reproducibility and interpretation by tracing results to specific DNA sequences.
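The ancestry and progeny queries described above can be illustrated with a small provenance graph. This is a minimal sketch and not Genestorian's actual data model: the `Resource` class, the `ancestry` and `progeny` helpers, the resource names and the placeholder sequences are all hypothetical, chosen only to show how recording the parents of each resource at the planning stage makes its history queryable.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical collection entry: a sequence plus the step that made it."""
    name: str
    sequence: str                      # placeholder sequence, not real data
    method: str = "imported"           # how the resource was generated
    parents: list["Resource"] = field(default_factory=list)


def ancestry(resource):
    """Return the names of all ancestors, nearest first, without duplicates."""
    seen = []
    queue = list(resource.parents)
    while queue:
        parent = queue.pop(0)          # breadth-first: closest ancestors first
        if parent.name not in seen:
            seen.append(parent.name)
            queue.extend(parent.parents)
    return seen


def progeny(resource, collection):
    """Return the names of collection entries that descend from `resource`."""
    return [r.name for r in collection if resource.name in ancestry(r)]


# Illustrative collection: a plasmid built from two parts, then integrated
# into a strain. All entries are made up for the example.
backbone = Resource("pBackbone", "ACGT")
insert = Resource("insert", "GGCC")
plasmid = Resource("pFinal", "ACGTGGCC", "ligation", [backbone, insert])
wild_type = Resource("wild_type", "TTTT")
strain = Resource("strain_1", "TTACGTGGCCTT", "homologous recombination",
                  [plasmid, wild_type])
collection = [backbone, insert, plasmid, wild_type, strain]

print(ancestry(strain))
print(progeny(backbone, collection))
```

Storing only the direct parents of each resource is enough here, because the full history is recovered by walking the graph.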
| Title | A database of annotated plasmids in the iGEM 2024 distribution |
| Description | The 2024 iGEM plasmid distribution provides teams with essential genetic parts for synthetic biology projects. However, these plasmids are distributed as raw DNA sequences without detailed sequence annotations, which identify functional elements such as genes, regulatory regions, and cloning features. To address this, we created a repository that annotates these plasmids using pLannotate, a tool for automated sequence annotation. By making these annotated plasmids available, we help researchers and iGEM teams quickly interpret plasmid functions, design experiments more efficiently, and reduce errors in genetic engineering workflows. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This resource enhances the accessibility and usability of the 2024 iGEM distribution for the synthetic biology community. In addition, the annotated plasmids can be accessed directly from the web application funded by this grant. |
| URL | https://github.com/manulera/annotated-igem-distribution |
| Title | A database of plasmids containing gateway cloning sites |
| Description | Data mining software project in which AddGene plasmids containing Gateway cloning sites were downloaded and categorised to produce a searchable database. The data were then used to derive consensus sequences for each type of Gateway site. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | Consensus sites produced from this data are used to simulate Gateway cloning in the web application funded by this grant. In addition, the site offers a portal to explore the dataset and find plasmids based on the features present in them. |
| URL | https://github.com/manulera/GateWayMine |
| Title | OpenCloning, a web application to plan and document cloning strategies |
| Description | OpenCloning is an open-source web application to plan and document cloning. Users can: 1. Import plasmid sequences from AddGene and gene sequences from NCBI. 2. Load their own sequence files. 3. Plan cloning and design primers using common techniques (Gibson, Golden Gate, Gateway, etc.). 4. Plan strain and cell line engineering via CRISPR and homologous recombination, with use cases not supported by SnapGene or Benchling. 5. Automate repetitive cloning and primer design using scripts or web forms. 6. Download final constructs as GenBank or FASTA files. 7. Archive the entire cloning history in an open format and load it later. 8. Create reusable cloning templates for cloning kits. |
| Type Of Technology | Webtool/Application |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | This web application is the main output planned from this funding. It is already available for researchers to plan and document their experiments. It currently supports most cloning methods supported by proprietary alternatives, even including some methods not supported by proprietary tools. It also allows users to export the plan of their experiment in an open format, which is not supported by proprietary software. |
| URL | https://github.com/manulera/OpenCloning |
| Title | pLannotate Web API and Docker Integration |
| Description | Sequence annotation is a critical step in synthetic biology, helping researchers identify functional elements within DNA sequences. pLannotate is a powerful tool for automated sequence annotation, but using it currently requires local installation and command-line expertise, and it is not easy to integrate into a pipeline or a production-level web application. To make pLannotate more accessible, I developed a web API that allows other applications to integrate its functionality. Additionally, we created a containerized Docker version for easy deployment and reproducibility across different computing environments. This work lowers the barrier for researchers and developers, enabling broader adoption of automated sequence annotation in synthetic biology workflows. |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Impact | This enabled the web application funded by this grant to integrate with this existing software package. |
| URL | https://github.com/manulera/pLannotate-api-docker |
| Description | Lead role in organisation of Synthetic Biology afterwork events |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | Local |
| Primary Audience | Postgraduate students |
| Results and Impact | Along with 4 other colleagues from UCL and Imperial College, I started a monthly seminar series, "London SynBio Network". The format is an afterwork event consisting of two 20-minute talks followed by networking with drinks and snacks. The main audience is Early Career Researchers and industry members interested in Synthetic Biology. So far, we have organised 6 events with an average registration of 80 people. These events have helped me meet prospective users of the web application that this grant funds. |
| Year(s) Of Engagement Activity | 2024,2025 |
| URL | https://events.humanitix.com/copy-of-london-synbio-network-6 |
| Description | Lead role in organisation of Python library hackathon and monthly meetings |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Other audiences |
| Results and Impact | As part of my work on the web application supported by this grant, I take part in the maintenance of the Python library pydna. Pydna is a Python package that provides human-readable formal descriptions of cloning and genetic assembly strategies for simulation and verification. Pydna can be used as executable documentation for cloning. I have taken a leading role in activating the community of users by organising monthly meetings with pro-users and developers, and by organising a one-day pydna "hackathon". The typical attendance of the monthly meetings is 6 people, and 12 people participated in the hackathon. These activities have created a small community of maintainers and users of the library who know each other, and have resulted in an overall improvement of the library, including bug fixes, documentation and better software development practices. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://github.com/pydna-group/pydna |
