Genome Annotation for the Masses
Lead Research Organisation:
Queen Mary University of London
Department Name: Sch of Biological and Chemical Sciences
Abstract
The hereditary information carried by each living thing is its genome, stored in the form of DNA sequences of As, Cs, Gs, and Ts. Between 1 and 5% of the genome sequence consists of genes. These genes contain instruction sets for small protein machines that accomplish specific tasks and ultimately determine the organism's shape, size, behavior, lifespan and disease susceptibility.
Determining the genome sequence of an organism is now straightforward. But understanding which genes are responsible for the unique characteristics of the organism remains challenging. This is due in particular to the difficulty of correctly finding the genes in the genome and determining which parts of their sequence encode proteins. Indeed, automatic gene identification software performs poorly, so evidence for each potential gene model needs to be visually inspected and corrected. As a result, preparing the data for even a small research project can take months.
Luckily there is a solution. Thousands of members of the general public have used the internet to contribute their time to scientific projects such as GalaxyZoo and FoldIt, be it out of curiosity, a desire to contribute to the greater good, to gain peer recognition or simply to have fun. Results of their contributions include the identification of previously unknown galaxy types and determination of the 3D structures of AIDS proteins.
The proposed project uses a similar approach to encourage members of the general public to help identify genes in the genome and refine their borders. We are constructing a game in which contributors use pattern recognition skills to improve gene models. Contributors will be able to choose to focus their efforts on particular species (e.g.: ants, humans, elephants) or research topics (e.g.: cancer, immunity, longevity, taste or odor perception, behavior). They will earn points and thus peer recognition for their contributions, and may be acknowledged in scientific publications or even financially compensated.
This project will thus allow members of the general public to have fun while helping to make the world a better place and facilitate scientific discovery.
Technical Summary
Genomes of emerging model organisms can now be sequenced at almost no cost. The major bottleneck has become obtaining accurate gene models, because automated gene prediction programs incorrectly predict start sites and intron-exon boundaries, and may even miss or merge whole genes, even if large amounts of RNA sequence are available. Fixing and refining gene models is thus required before rigorous analyses can be performed. However, refining a single gene model can take up to several hours and thus remains difficult to justify beyond exceptional cases.
Tasks from other research areas that require human brainpower but are similarly repetitive have been successfully crowd-sourced to members of the general public. GalaxyZoo volunteers have categorized millions of photos of galaxies and thus triggered the characterization of multiple previously unknown galaxy types and other stellar objects. Similarly, players of the FoldIt game earn points by minimizing the free energy of putative protein structures and in some cases perform better than specialized structure prediction algorithms or even expert protein modelers.
Contributors to such projects may be motivated by the intellectual challenge, the desire to learn new skills, to contribute to the greater good, to compete or earn recognition among peers, or in some cases even to earn small amounts of financial compensation. The project proposed here takes inspiration from such crowd-sourcing initiatives. We aim to create an online game that crowd-sources gene model refinement. In doing so, our game will provide a key service to biologists by rapidly generating high-quality gene annotations at little or no cost.
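To make the idea of crowd-sourced refinement more concrete, the following is a minimal sketch of one way that several contributors' submissions for the same exon could be combined. It is an illustration only: the majority-vote rule, the `Exon` type, the `consensus_exon` function and the `min_agreement` threshold are all assumptions introduced here, not a description of how the project's software aggregates contributions.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Exon(NamedTuple):
    """Hypothetical exon record: 1-based, inclusive genome coordinates."""
    start: int
    end: int


def consensus_exon(submissions: list[Exon],
                   min_agreement: float = 0.5) -> Optional[Exon]:
    """Return the exon submitted by a strict majority of contributors.

    Assumption for illustration: an exon is accepted only if its share of
    submissions is strictly greater than `min_agreement`. Otherwise None is
    returned, signalling that the submissions conflict and need more review.
    """
    if not submissions:
        return None
    # Count identical (start, end) pairs and take the most frequent one.
    exon, votes = Counter(submissions).most_common(1)[0]
    if votes / len(submissions) > min_agreement:
        return exon
    return None


if __name__ == "__main__":
    # Two of three contributors agree on the same boundaries.
    votes = [Exon(100, 250), Exon(100, 250), Exon(103, 250)]
    print(consensus_exon(votes))
```

A stricter threshold (for example `min_agreement=0.75`) would send more gene models back for further inspection, trading throughput for confidence.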
Planned Impact
Members of the general public who use our software will learn new biological knowledge and skills. This capacity building will occur thanks to use of educational material we put on the website and to the thought processes required for refining gene models. Additionally, the visibility this project obtains among contributors and the general public will increase public engagement with biological research.
Visibility will be obtained:
* initially through a small online advertising campaign and use of our tool in coursework, and subsequently by strongly encouraging users to advertise their participation to peers on social networks such as Facebook,
* through our international, interdisciplinary team of collaborators,
* thanks to the public relations office at Queen Mary University of London, the Swiss Institute of Bioinformatics (SwissProt) and other collaborating institutes.
Our project will also:
* contribute toward changing organisational culture and practices by showing that crowdsourcing practices work,
* accelerate discoveries in fundamental bioscience including those relating to food security and improving human quality of life and health,
* improve the effectiveness of researchers thus indirectly improving society.
Organisations
- Queen Mary University of London (Lead Research Organisation)
- Natural History Museum (Collaboration)
- Lawrence Berkeley National Laboratory (Collaboration)
- University of St Andrews (Collaboration)
- UK Centre for Ecology & Hydrology (Collaboration)
- National Science Foundation (NSF) (Collaboration)
- Commonwealth Scientific and Industrial Research Organisation (Collaboration)
- University of Oxford (Collaboration)
- Alan Turing Institute (Collaboration)
- Cardiff University (Collaboration)
- Swiss Institute of Bioinformatics (SIB) (Collaboration)
- Swiss Institute of Bioinformatics (Project Partner)
- Ontario Institute for Cancer Research (Project Partner)
Publications
Wurm Y
(2015)
Arthropod genomics beyond fruit flies: bridging the gap between proximate and ultimate causation.
in Briefings in functional genomics
Wang J
(2013)
A Y-like social chromosome causes alternative colony organization in fire ants
in Nature
Wallom D
(2015)
Desktop as a Service Supporting Environmental 'omics
Stolle E
(2022)
Recurring adaptive introgression of a supergene variant that determines social organization.
in Nature communications
Stolle E
(2018)
Degenerative expansion of a young supergene
Stolle E
(2019)
Degenerative Expansion of a Young Supergene.
in Molecular biology and evolution
Simola DF
(2013)
Social insect genomes exhibit dramatic evolution in gene composition and regulation while preserving regulatory features linked to sociality.
in Genome research
Semmens DC
(2016)
Transcriptomic identification of starfish neuropeptide precursors yields new insights into neuropeptide evolution.
in Open biology
Schrader L
(2014)
Transposable element islands facilitate adaptation to novel environments in an invasive species.
in Nature communications
Purcell J
(2014)
Convergent genetic architecture underlies social organization in ants.
in Current biology : CB
Description | Accurate gene predictions are essential for much of modern biological research. Unfortunately, they are only possible after visual inspection and manual fixing (curation). This makes projects requiring high-quality predictions for thousands of genes impossible beyond work on humans and fruit flies. We built a tool that aims to bring gene feature visualisation and improvement to a larger group of people. With this "crowd-sourcing" approach, we 1. obtain improved gene predictions (which in turn improves analyses that depend on them) and 2. educate contributors (currently university-level students). The software currently offers a complete crowd-sourcing approach for contributors who already have some basic biological knowledge. We hope to expand it on at least three levels: 1. so that it is used in other institutions, 2. to better deal with complex gene predictions (when contributors provide conflicting information), and 3. to reduce the learning curve by improving tutorials for non-biologists. Additionally, we have created and published a tool that helps visualise problems with gene predictions (genevalidator). |
Exploitation Route | We have begun to collaborate with others who want to build upon our approach to 1. improve teaching curricula, 2. improve gene prediction quality and 3. add more biorelevance to crowd-sourcing initiatives. We have received a small grant (Drapers' Fund for Innovation in Learning and Teaching; 5000 GBP) to push key features of this further. |
Sectors | Digital/Communication/Information Technologies (including Software) Education Other |
URL | http://afra.sbcs.qmul.ac.uk |
Description | Thanks to the 10,000-fold drop in DNA sequencing costs since 2007, it is far easier to obtain a genome sequence than before. Obtaining high-quality gene predictions remains complex, as individual gene predictions need to be verified and often improved by humans. We have developed basic software to "crowd-source" gene prediction verification and improvement. We have already used it in teaching undergraduate and masters-level students about 1. gene structure, 2. the tradeoffs in automated analysis and 3. comparative genomics. This is being used in multiple institutions world-wide. While the students learn, they contribute to research; in particular, they have contributed improved gene models for fire ant genomes. We are reaching out to other communities that will be able to take advantage of this tool. Furthermore, our project is open source and the computer code has already been used in several additional projects. |
First Year Of Impact | 2013 |
Sector | Digital/Communication/Information Technologies (including Software),Education,Other |
Impact Types | Cultural Societal |
Description | Nescent working group - curriculum development |
Geographic Reach | Multiple continents/international |
Policy Influence Type | Influenced training of practitioners or researchers |
Description | Software Development best practices in bioinformatics |
Geographic Reach | Multiple continents/international |
Policy Influence Type | Influenced training of practitioners or researchers |
Impact | We have actively advocated for the respect of best practices for software development in scientific research. This is in line with partner efforts at http://software.ac.uk. This has broad impacts throughout the sciences in which software is used (i.e. almost all of them!), and in particular in genomics/bioinformatics, where such approaches remain undervalued (and the potential risk of not pursuing best practices is not yet widely known). |
Description | BBSRC NPIF Case Studentship |
Amount | £107,034 (GBP) |
Funding ID | BB/S507556/1 |
Organisation | Biotechnology and Biological Sciences Research Council (BBSRC) |
Sector | Public |
Country | United Kingdom |
Start | 12/2018 |
End | 11/2022 |
Description | Google Summer of Code |
Amount | $5,500 (USD) |
Funding ID | Monica Dragan |
Organisation | |
Department | Google Summer of Code |
Sector | Charity/Non Profit |
Country | United States |
Start | 04/2013 |
End | 10/2013 |
Description | Google Summer of Code - Hiten |
Amount | $5,000 (USD) |
Organisation | |
Department | Google Summer of Code |
Sector | Charity/Non Profit |
Country | United States |
Start | 05/2016 |
End | 09/2016 |
Description | Google Summer of Code - Julian Mazzitelli |
Amount | $5,000 (USD) |
Organisation | |
Department | Google Summer of Code |
Sector | Charity/Non Profit |
Country | United States |
Start | 05/2016 |
End | 09/2016 |
Description | Marie Sklodowska Curie Incoming Fellowship H2020-MSCA-IF-2018 |
Amount | € 224,933 (EUR) |
Organisation | European Commission H2020 |
Sector | Public |
Country | Belgium |
Start | 09/2019 |
End | 10/2021 |
Description | Marie Sklodowska Curie Incoming Fellowship H2020-MSCA-IF-2018 (another) |
Amount | € 212,933 (EUR) |
Funding ID | EvolvAnt |
Organisation | European Commission H2020 |
Sector | Public |
Country | Belgium |
Start | 03/2020 |
End | 02/2022 |
Description | Marie Curie |
Amount | € 221,606 (EUR) |
Funding ID | 623713 |
Organisation | European Commission |
Department | Seventh Framework Programme (FP7) |
Sector | Public |
Country | European Union (EU) |
Start | 02/2015 |
End | 02/2017 |
Description | NE/P012574/1 |
Amount | £648,559 (GBP) |
Funding ID | NE/P012574/1 |
Organisation | Natural Environment Research Council |
Sector | Public |
Country | United Kingdom |
Start | 05/2017 |
End | 04/2020 |
Description | NERC big capital |
Amount | £500,000 (GBP) |
Organisation | Natural Environment Research Council |
Sector | Public |
Country | United Kingdom |
Start | 08/2013 |
End | 03/2015 |
Description | Nescent working group |
Amount | $50,000 (USD) |
Organisation | National Science Foundation (NSF) |
Department | National Evolutionary Synthesis Center |
Sector | Academic/University |
Country | United States |
Start | 01/2013 |
End | 11/2015 |
Description | QMUL - Drapers' Fund for Innovation in Learning and Teaching |
Amount | £5,000 (GBP) |
Organisation | Queen Mary University of London |
Sector | Academic/University |
Country | United Kingdom |
Start | 01/2017 |
End | 07/2017 |
Description | Software sustainability Fellowship |
Amount | £3,000 (GBP) |
Organisation | University of Edinburgh |
Department | UK Software Sustainability Institute |
Sector | Academic/University |
Country | United Kingdom |
Start | 01/2013 |
End | 03/2015 |
Title | Afra: Gene curation crowdsourcing platform |
Description | See software: afra |
Type Of Material | Technology assay or reagent |
Year Produced | 2014 |
Provided To Others? | Yes |
Impact | See software: afra |
URL | http://afra.sbcs.qmul.ac.uk |
Title | Bionode |
Description | See software: Bionode |
Type Of Material | Technology assay or reagent |
Year Produced | 2014 |
Provided To Others? | Yes |
Impact | See software: Bionode |
URL | http://www.bionode.io |
Title | Flo |
Description | Software: flo to transfer gene predictions from one genome assembly to another genome assembly (from same species) |
Type Of Material | Technology assay or reagent |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | Makes it easier to use new (higher quality) genome assemblies |
URL | https://github.com/wurmlab/flo |
Title | Genevalidator |
Description | See software: Genevalidator |
Type Of Material | Technology assay or reagent |
Year Produced | 2014 |
Provided To Others? | Yes |
Impact | See software: Genevalidator |
URL | http://genevalidator.sbcs.qmul.ac.uk |
Title | Additional file 1: of The first draft genomes of the ant Formica exsecta, and its Wolbachia endosymbiont reveal extensive gene transfer from endosymbiont to host |
Description | Table S1. Comparison of assembly statistics of the F. exsecta genome and 13 other published ant genomes. (XLSX 11 kb) |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | Yes |
URL | https://springernature.figshare.com/articles/Additional_file_1_of_The_first_draft_genomes_of_the_ant... |
Title | Additional file 2: of The first draft genomes of the ant Formica exsecta, and its Wolbachia endosymbiont reveal extensive gene transfer from endosymbiont to host |
Description | Table S2. List of genes specific to the Formicinae as identified by OrthoVenn. (XLSX 20 kb) |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | Yes |
URL | https://springernature.figshare.com/articles/Additional_file_2_of_The_first_draft_genomes_of_the_ant... |
Title | Additional file 3: of The first draft genomes of the ant Formica exsecta, and its Wolbachia endosymbiont reveal extensive gene transfer from endosymbiont to host |
Description | Table S3. List of species-specific genes in F. exsecta, as identified by OrthoVenn. (XLSX 24 kb) |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | Yes |
URL | https://springernature.figshare.com/articles/Additional_file_3_of_The_first_draft_genomes_of_the_ant... |
Title | Additional file 4: of The first draft genomes of the ant Formica exsecta, and its Wolbachia endosymbiont reveal extensive gene transfer from endosymbiont to host |
Description | Table S4. List of F. exsecta genes under positive or relaxed purifying selection (dN/dS ratios > 1) in comparison to five other ant species (Camponotus floridanus, Lasius niger, Solenopsis invicta and Cerapachys biroi) (XLSX 115 kb) |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | Yes |
URL | https://springernature.figshare.com/articles/Additional_file_4_of_The_first_draft_genomes_of_the_ant... |
Title | Additional file 5: of The first draft genomes of the ant Formica exsecta, and its Wolbachia endosymbiont reveal extensive gene transfer from endosymbiont to host |
Description | Table S5. List of F. exsecta genes showing dN/dS ratios > 1 in pairwise comparison to Camponotus floridanus. (XLSX 11 kb) |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | Yes |
URL | https://springernature.figshare.com/articles/Additional_file_5_of_The_first_draft_genomes_of_the_ant... |
Title | Additional file 7: of The first draft genomes of the ant Formica exsecta, and its Wolbachia endosymbiont reveal extensive gene transfer from endosymbiont to host |
Description | Table S6. List of genes with paralogs in the wFex genome, which are present as single copies in the wMel, wRi, wDac genomes. (XLSX 27 kb) |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | Yes |
URL | https://springernature.figshare.com/articles/Additional_file_7_of_The_first_draft_genomes_of_the_ant... |
Title | Additional file 8: of The first draft genomes of the ant Formica exsecta, and its Wolbachia endosymbiont reveal extensive gene transfer from endosymbiont to host |
Description | Table S7. List of conserved Wolbachia genes used for phylogenetic analysis. (XLSX 14 kb) |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | Yes |
URL | https://springernature.figshare.com/articles/Additional_file_8_of_The_first_draft_genomes_of_the_ant... |
Title | Additional file 9: of The first draft genomes of the ant Formica exsecta, and its Wolbachia endosymbiont reveal extensive gene transfer from endosymbiont to host |
Description | Table S8. List of species-specific genes in wFex genome, as identified by OrthoVenn. (XLSX 15 kb) |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | Yes |
URL | https://springernature.figshare.com/articles/Additional_file_9_of_The_first_draft_genomes_of_the_ant... |
Title | Data from: The fire ant social chromosome supergene variant Sb shows low diversity but high divergence from SB |
Description | Variation in social behavior is common yet little is known about the genetic architectures underpinning its evolution. A rare exception is in the fire ant Solenopsis invicta: Alternative variants of a supergene region determine whether a colony will have exactly one or up to dozens of queens. The two variants of this region are carried by a pair of "social chromosomes", SB and Sb, which resemble a pair of sex chromosomes. Recombination is suppressed between the two chromosomes in the supergene region. While the X-like SB can recombine with itself in SB/SB queens, recombination is effectively absent in the Y-like Sb because Sb/Sb queens die before reproducing. Here, we analyze whole genome sequences of eight haploid SB males and eight haploid Sb males. We find extensive SB-Sb differentiation throughout the >19Mb long supergene region. We find no evidence of "evolutionary strata" with different levels of divergence comparable to those reported in several sex chromosomes. A high proportion of substitutions between the SB and Sb haplotypes are nonsynonymous, suggesting inefficacy of purifying selection in Sb sequences, similar to that for Y-linked sequences in XY systems. Finally, we show that the Sb haplotype of the supergene region has 635-fold less nucleotide diversity than the rest of the genome. We discuss how this reduction could be due to a recent selective sweep affecting Sb specifically or associated with a population bottleneck during the invasion of North America by the sampled population. |
Type Of Material | Database/Collection of data |
Year Produced | 2017 |
Provided To Others? | Yes |
URL | https://datadryad.org/stash/dataset/doi:10.5061/dryad.js509 |
Description | Bioinformatics for the classroom - Raspberry Pi |
Organisation | University of St Andrews |
Department | School of Biology |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Joint project development - not yet funded. |
Collaborator Contribution | Joint project development - not yet funded. |
Impact | Grant development to bring bioinformatics skills to high-school level students- not yet funded |
Start Year | 2013 |
Description | Chris Dessimoz |
Organisation | Swiss Institute of Bioinformatics (SIB) |
Country | Switzerland |
Sector | Charity/Non Profit |
PI Contribution | I initiated the collaboration to obtain new PhD funds |
Collaborator Contribution | Engaged constructively. Has extensive expertise needed for joint project |
Impact | Recently funded BBSRC NPIF grant |
Start Year | 2015 |
Description | Collab Marc Robinson Rechavi |
Organisation | Swiss Institute of Bioinformatics (SIB) |
Country | Switzerland |
Sector | Charity/Non Profit |
PI Contribution | New collaboration - we obtained samples, dissected, extracted RNA. We are leading bioinformatic analysis |
Collaborator Contribution | Partner contributed funds for field sampling (which we did), and for gene expression sequencing. They are helping with bioinformatic analysis |
Impact | not yet |
Start Year | 2016 |
Description | Fellowship at Alan Turing Institute for data science and artificial intelligence |
Organisation | Alan Turing Institute |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | I am a fellow - interacting with data-centric peers from other fields |
Collaborator Contribution | Expertise of others in data science techniques - carrying over expertise into our research. -> synergistic grant application and project ideas. |
Impact | Collaborative BBSRC grant submission |
Start Year | 2018 |
Description | NERC EOS Cloud |
Organisation | Cardiff University |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Joint grant proposal (NERC) put together and awarded (500,000) - including 85k for our side. Partners contributed equally in terms of vision development - more of the leadership in terms of implementation came from our partners. We are developing a sub aspect of the project. |
Collaborator Contribution | Joint grant proposal (NERC) put together and awarded (500,000) - including 85k for our side. Partners contributed equally in terms of vision development - more of the leadership in terms of implementation came from our partners. We are developing a sub aspect of the project. |
Impact | In progress. But we are in talks with related projects funded by other RCUK members. |
Start Year | 2014 |
Description | NERC EOS Cloud |
Organisation | UK Centre for Ecology & Hydrology |
Country | United Kingdom |
Sector | Public |
PI Contribution | Joint grant proposal (NERC) put together and awarded (500,000) - including 85k for our side. Partners contributed equally in terms of vision development - more of the leadership in terms of implementation came from our partners. We are developing a sub aspect of the project. |
Collaborator Contribution | Joint grant proposal (NERC) put together and awarded (500,000) - including 85k for our side. Partners contributed equally in terms of vision development - more of the leadership in terms of implementation came from our partners. We are developing a sub aspect of the project. |
Impact | In progress. But we are in talks with related projects funded by other RCUK members. |
Start Year | 2014 |
Description | NERC EOS Cloud |
Organisation | University of Oxford |
Department | Oxford E-Research Centre |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Joint grant proposal (NERC) put together and awarded (500,000), including 85k for our side. Partners contributed equally to developing the vision; our partners took more of the lead on implementation. We are developing a sub-aspect of the project. |
Collaborator Contribution | Joint grant proposal (NERC) put together and awarded (500,000), including 85k for our side. Partners contributed equally to developing the vision; our partners took more of the lead on implementation. We are developing a sub-aspect of the project. |
Impact | In progress. But we are in talks with related projects funded by other RCUK members. |
Start Year | 2014 |
Description | NHM studentship |
Organisation | Natural History Museum |
Country | United Kingdom |
Sector | Public |
PI Contribution | New collaboration to develop molecular approaches for environmental sensing |
Collaborator Contribution | New collaboration to develop molecular approaches for environmental sensing |
Impact | Recruiting a 4-year PhD student |
Start Year | 2024 |
Description | NESCent: Building non-model species genome curation communities |
Organisation | Commonwealth Scientific and Industrial Research Organisation |
Country | Australia |
Sector | Public |
PI Contribution | We have contributed software, ideas, meeting time (three one-week workshops) and writing time. |
Collaborator Contribution | The workshops were oriented around curriculum development and identifying software needs for genomics on emerging model organisms. The partners contributed code, writing, documentation and ideas. This was extremely productive and helped our project in terms of feedback, in-kind contributions (code), effort optimisation (the partners include developers of Apollo, on which our work is based), and the visibility and impact of our tool. Additionally, the meetings resulted in our writing a review article together and submitting a joint grant application (BBSRC+NSF joint bid). |
Impact | * joint review publication * source code * joint grant application |
Start Year | 2013 |
Description | NESCent: Building non-model species genome curation communities |
Organisation | Lawrence Berkeley National Laboratory |
Country | United States |
Sector | Public |
PI Contribution | We have contributed software, ideas, meeting time (three one-week workshops) and writing time. |
Collaborator Contribution | The workshops were oriented around curriculum development and identifying software needs for genomics on emerging model organisms. The partners contributed code, writing, documentation and ideas. This was extremely productive and helped our project in terms of feedback, in-kind contributions (code), effort optimisation (the partners include developers of Apollo, on which our work is based), and the visibility and impact of our tool. Additionally, the meetings resulted in our writing a review article together and submitting a joint grant application (BBSRC+NSF joint bid). |
Impact | * joint review publication * source code * joint grant application |
Start Year | 2013 |
Description | NESCent: Building non-model species genome curation communities |
Organisation | National Science Foundation (NSF) |
Department | National Evolutionary Synthesis Center |
Country | United States |
Sector | Academic/University |
PI Contribution | We have contributed software, ideas, meeting time (three one-week workshops) and writing time. |
Collaborator Contribution | The workshops were oriented around curriculum development and identifying software needs for genomics on emerging model organisms. The partners contributed code, writing, documentation and ideas. This was extremely productive and helped our project in terms of feedback, in-kind contributions (code), effort optimisation (the partners include developers of Apollo, on which our work is based), and the visibility and impact of our tool. Additionally, the meetings resulted in our writing a review article together and submitting a joint grant application (BBSRC+NSF joint bid). |
Impact | * joint review publication * source code * joint grant application |
Start Year | 2013 |
Title | Afra: Crowdsourcing genome annotation |
Description | As described elsewhere on Researchfish, this tool brings gene feature visualisation and improvement to a larger group of people, with two aims: 1. to improve gene predictions (and analyses that depend on them) and 2. to help educate contributors. The software currently offers a complete approach for contributors who already have some basic biological knowledge. |
Type Of Technology | Software |
Year Produced | 2014 |
Open Source License? | Yes |
Impact | * deployed to students (improvements in learning experience) * new collaborations created (Dundee, NESCent (US and Australia), TGAC) * creating better gene curations for ants |
URL | http://afra.sbcs.qmul.ac.uk |
Title | Bionode |
Description | Major challenges when doing bioinformatics work include eliminating redundancy and having to juggle heterogeneous technologies. To facilitate our work (specific aims of the funded project) while creating an environment with broader impact, we started Bionode. Bionode provides pipeable UNIX command line tools and JavaScript APIs for bioinformatic analysis workflows. This means that a library written once is available on the command line, on the client side (web app) and on a high-performance compute cluster. Furthermore, this software library is built using Node.js, allowing it to take advantage of the large amount of work done by people in internet startups. |
Type Of Technology | Software |
Year Produced | 2014 |
Open Source License? | Yes |
Impact | This project has attracted users and contributors from around the world (others are improving what we set up), while facilitating development & improving maintainability and robustness of the main funded project. |
URL | http://www.bionode.io |
Title | GeneValidator |
Description | Genomes of emerging model organisms are now being sequenced at very low cost. However, obtaining accurate gene predictions remains challenging. Even the best gene prediction algorithms make substantial errors, leading to further erroneous analysis. Therefore, many predicted genes need to be visually inspected and manually curated, a time consuming process. Here we propose GeneValidator, a tool to identify problematic gene predictions and to guide curation efforts. For each newly predicted protein-coding gene, GeneValidator finds similar sequences in databases of known genes and performs general gene-characteristic comparisons. The resulting report highlights differences between each putative protein-coding gene and similar genes from the database. This allows rapid identification of curation need and guides curators in performing their work. We thus expect GeneValidator to greatly accelerate and enhance the work of biocurators and researchers working with recently sequenced genomes. |
Type Of Technology | Software |
Year Produced | 2014 |
Open Source License? | Yes |
Impact | Publication is in prep. |
URL | https://github.com/monicadragan/GeneValidator/ |
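The kind of gene-characteristic comparison described for GeneValidator can be illustrated with a minimal sketch. This is not GeneValidator's actual code: the function names, the use of the median, and the 25% tolerance are all assumptions chosen for illustration. It compares the length of each predicted protein with the lengths of similar database sequences and flags predictions that look too short or too long.

```python
from statistics import median


def length_flag(predicted_len, hit_lens, tolerance=0.25):
    """Classify a predicted protein length against lengths of similar sequences.

    Hypothetical helper: the median reference and the `tolerance` default
    are illustrative assumptions, not values taken from GeneValidator.
    """
    if not hit_lens:
        return "no_evidence"  # nothing similar found, so no comparison is possible
    reference = median(hit_lens)
    if predicted_len < reference * (1 - tolerance):
        return "too_short"  # possibly a truncated or split gene model
    if predicted_len > reference * (1 + tolerance):
        return "too_long"  # possibly two genes merged into one model
    return "ok"


def length_report(predictions, hits):
    """Map each predicted gene ID to its length flag.

    `predictions` maps gene ID -> predicted protein length;
    `hits` maps gene ID -> list of lengths of similar database sequences.
    """
    return {
        gene_id: length_flag(length, hits.get(gene_id, []))
        for gene_id, length in predictions.items()
    }
```

A curator could sort such a report so that genes flagged `too_short` or `too_long` are inspected first, which is the prioritisation role the description assigns to the tool.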
Title | Sequenceserver |
Description | Makes it easier to perform BLAST searches. |
Type Of Technology | Software |
Year Produced | 2012 |
Open Source License? | Yes |
Impact | (development has continued). |
URL | http://www.sequenceserver.com |
Company Name | Pragmatic Genomics |
Description | Pragmatic Genomics develops a range of data science software and provides data analysis, specialising in genomics data. |
Year Established | 2021 |
Impact | Customers in private (biotech, agroindustry), public and third sectors. |
Website | https://pragmaticgenomics.com/ |
Description | Passive web recruitment |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | Simply having our website up, even before the software was ready, led to ~30 previously unknown people signing up to contribute curations. We don't know how they found us other than through Google. We weren't able to cater to them as well as we would have wanted because our software platform was still too young. In any case, this shows the potential of our crowd-sourcing approach to recruit participants via our online presence. |
Year(s) Of Engagement Activity | 2014 |
URL | http://afra.sbcs.qmul.ac.uk |