Structured and graphical queries for Drosophila neuroscience data

Lead Research Organisation: University of Cambridge
Department Name: Genetics

Abstract

Disorders of the nervous system account for the single biggest cost to the National Health Service and affect one in three people in the developed world at some point in their life. Designing treatment therapies requires us first to understand how the brain works, yet it is the most complex organ known, so simpler models are essential. The brain of the fruit fly, Drosophila melanogaster, provides an excellent model system for studying how brains function. It is orders of magnitude smaller and simpler than a mammalian brain, yet genetically it is remarkably similar. Moreover, like mammalian brains, it is capable of learning and is remodelled in response to experience and environmental context. There is a long history of research into the brain of Drosophila and other insects. This gives a firm foundation to modern studies of the genetic basis of how the Drosophila brain is built and functions. Such studies take advantage of an increasingly powerful array of genetic techniques that allow specific regions, cells and genes to be disrupted so that their function can be measured. At the same time, increasingly sophisticated imaging techniques are revealing the structure of the Drosophila brain in ever-finer detail. The sheer volume and microscopic detail of the data being collected pose a problem to researchers wanting to build and communicate coherent models of brain function or to share the tools they use for their experiments. Navigating through the blizzard of new information is made particularly difficult by the varying and often confusing nomenclature that is an inevitable feature of a complicated field with such a long history. We aim to remedy this by building a web-based atlas and search tool, Virtual Fly Brain. Users will be able to navigate by clicking on labelled regions in a 3D reference image of a brain, or by searching and browsing a structured vocabulary which names brain regions and the brain cells which connect them.
Users will be able to highlight brain regions in various colours by choosing terms in the vocabulary they find through browsing and searching. Choosing a term will also prompt the display of various information related to that term: links to additional images; written definitions with references to the scientific papers they come from; synonyms and comments to help disambiguate confusing or conflicting usage of terms. Users will be able to use the lists of terms generated by these queries to search for related data stored in FlyBase, the main genetic database of the Drosophila community. This will allow them to find genes expressed in structures on the list or which are known to be involved in the construction or function of these structures. It will also allow them to search for sophisticated genetic reagents which target these structures. Finally, we will provide tools to help new researchers and students to explore and learn how the brain is organised and allow expert users to label their own data using our structured vocabulary and for.

Technical Summary

The amount and complexity of data being produced on the structure, development and connectivity of the Drosophila brain mean that biologists will increasingly need tools to help them search, integrate and synthesise these data into models of how the brain works. The neuroanatomical working group is about to agree upon a new set of defined terms and boundaries for the major neuropil of the adult fly brain and presumably will follow with the tract systems and developmental histories. This is an essential first step, but these conclusions need to be integrated into a formal ontology that extends the existing whole body/organism ontology currently embedded in FlyBase. Further, the ontology needs to be augmented with the development of appropriate data structures, query tools and visualisation software. Hence, we do not propose a new atlas or database of expression patterns (these already exist, or will be developed by others), but rather a means to integrate any such resource with a centralised query system. We will also have to extend the FlyBase anatomy ontology to support neural-specific features (particularly connectivity information). This will allow new, more complex queries to be developed that return data on information flow in the CNS. The centralised query system will also link terms to phenotypic and gene information within FlyBase as well as images from external databases or atlases that use the ontology terms agreed by the nomenclature working group. We will develop a series of tools for building and visualising queries based on neuroanatomical terms. All software/tools produced will be open-source so that they can be re-used or extended by the community. As part of the study we will also revise and extend the annotation of Drosophila neuroscience studies indexed in FlyBase, which currently lags behind other areas. Joint with BB/G02274X/1.

Publications

Cantarelli M (2018) Geppetto: a reusable modular open platform for exploring neuroscience data and models. in Philosophical Transactions of the Royal Society B: Biological Sciences

Costa M (2013) The Drosophila anatomy ontology. in Journal of biomedical semantics

Milyaev N (2012) The Virtual Fly Brain browser and query interface. in Bioinformatics (Oxford, England)

Osumi-Sutherland D (2012) A strategy for building neuroanatomy ontologies. in Bioinformatics (Oxford, England)

 
Description The Virtual Fly Brain web resource, generated by this project, makes it easy for researchers to search and query across information from hundreds of papers and tens of thousands of database entries and images to find key data to help them formulate hypotheses and to find the reagents they need to do their experiments. The standard vocabulary that lies at the core of this resource also provides a means for users to mark up their own data in a way that allows them to be easily integrated with data on our site. The new methods for finding groups of neurons with similar shape and location, developed for this project and available on our site, provide a means for users to make sense of the vast amount of image data now becoming available in this field.



We first publicly released VFB in March 2011. Its user base has increased as we have added functionality and improved the user interface, with over 2700 pageviews in November 2012 and rising.



The data on VFB include:



(a) Implementation of BrainName standard nomenclature (Neuron, under revision) with terms linked to painted domains in a standard 3D brain image. Users can browse virtual sections of this brain and select regions by pointing and clicking. Selecting a region prompts display of descriptions and references. Selected regions can be queried for innervating tracts and neurons, for expression of genes, transgenes, and for phenotypes.



(b) A near comprehensive catalogue of published neuron classes for the adult brain (590 classes from 119 papers). Neuron classes can be found by autocomplete searching or by queries of neuropil regions or tracts. All classes have a text description, references, synonyms, and queries for expression of genes, transgenes and phenotypes. We provide images where available (e.g. from FlyCircuit [Chiang 2010 Curr Biol]), with links to their source.



(c) Extensive, queryable expression patterns and phenotypes for the adult brain: 9006 expression assertions for 3139 transgenes, 1270 assertions for 299 genes and 4512 phenotype assertions. Following user feedback we have devoted significant effort to documenting transgenes for targeted manipulation of specific neurons. VFB has a near comprehensive set of published transgene expression patterns for adult brain, several thousand transgene expression patterns from BrainTrap [Knowles-Barley 2011, Database] and HHMI Janelia Farm Flylight [Jenett 2012 Cell Rep], including images. All query results show links to FlyBase and their source data or reference.



(d) Over 16000 single-neuron images from FlyCircuit, registered to our standard brain and clustered by similarity, measured using a tool (N-Blast) developed by G. Jefferis. Some clusters correspond to known classes, while others predict new ones. The clusters can be found via queries for the brain regions they overlap and are viewable as rotatable 3D images.



2. An ontology of Drosophila neuroanatomy covering all stages and regions, for use in annotating images, expression, phenotypes and more. All data annotated with it can be rapidly integrated into VFB and FlyBase - as we have done with HHMI Janelia Farm and BrainTrap data. The ontology consists of 3680 terms and 5843 logical axioms, including 2361 that record neuronal specific properties, based on 325 references. The logical structure of the ontology in combination with standard OWL reasoning software drives queries on the VFB site. More sophisticated queries are available when the ontology is used directly.



3. Expression curation includes gene and transgene expression for all stages and regions of the Drosophila nervous system. Much of this is not yet available on VFB, but all is available on FlyBase. We have curated: 8897 expression assertions about 923 transgenes and 556 genes from 393 papers; 5575 expression assertions about 2242 transgenes from 2 databases (BrainTrap and Flylight).



4. A standard system for representing neuroanatomy in queryable form using web ontology language. This is intended to be a general system applicable to any nervous system. It is already in active use by other ontologies including the Cell Ontology (http://cellontology.org).



5. User toolkit:

We released an open-source neuron tracing tool in 2011 (Longair, 2011 Bioinformatics). We have a prototype brain annotation tool in alpha-testing that allows users to register their own brains to our standard and annotate with ontology terms.
Exploitation Route There has been some interest in VFB reference brains for quantitative anatomical comparisons with brain images from neurodegeneration-affected, aged and matched controls. VFB is itself now a public database, hosting data deposited and curated as part of this project. This includes thousands of expression patterns; referenced descriptions of hundreds of neuron classes and brain regions and queryable assertions of the relationships between them; over 16000 single-neuron images grouped into over 600 clusters on the basis of their morphological similarity.
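
To illustrate what grouping images by morphological similarity can look like, here is a minimal sketch in Python. It assumes a table of pairwise similarity scores is already available and links any two images whose score meets a threshold (single-linkage grouping with a union-find structure). This is not the N-Blast method or the clustering procedure used on VFB; the image names, scores and threshold are all invented for illustration.

```python
from itertools import combinations


def cluster_by_similarity(scores, names, threshold):
    """Group items whose pairwise similarity meets the threshold (single linkage)."""
    parent = {name: name for name in names}

    def find(item):
        # Follow parent links up to the representative of the item's group.
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    for a, b in combinations(names, 2):
        # Scores are symmetric here; a missing pair counts as no similarity.
        score = scores.get((a, b), scores.get((b, a), 0.0))
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical image identifiers and made-up similarity scores.
NAMES = ["neuron_A", "neuron_B", "neuron_C", "neuron_D"]
SCORES = {
    ("neuron_A", "neuron_B"): 0.9,
    ("neuron_B", "neuron_C"): 0.8,
    ("neuron_A", "neuron_C"): 0.2,
    ("neuron_C", "neuron_D"): 0.1,
}

clusters = cluster_by_similarity(SCORES, NAMES, threshold=0.7)
print(clusters)
```

With single linkage, neuron_A and neuron_C end up in the same group through neuron_B even though their direct score is low. That chaining behaviour is one reason a real pipeline might choose a different linkage rule.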



The ontology built during this project is openly available in OBO and OWL formats via the OBO foundry and searchable via multiple ontology portals (e.g. bioportal, ontology lookup service). OWL versions of the ontology and every term within it can also be accessed and referenced using persistent URLs that are guaranteed to be stable and resolve to a web page when accessed with a browser, but to rdf-xml when accessed programmatically. All terms have also been deposited in NeuroLex (http://neurolex.org/) where they are available in wiki form for user editing.
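
As a rough illustration of the programmatic access described above, the following Python sketch builds an HTTP request that asks for RDF/XML via the Accept header, which is the usual way a client signals the format it wants. The URL pattern and the term identifier are assumptions chosen for illustration, and whether a given server honours the header depends on how the persistent URLs are configured.

```python
import urllib.request

# Assumed persistent-URL pattern; the term identifier is illustrative only.
TERM_URL = "http://purl.obolibrary.org/obo/FBbt_00003624"


def build_rdf_request(url):
    """Return a request that asks the server for RDF/XML rather than HTML."""
    return urllib.request.Request(url, headers={"Accept": "application/rdf+xml"})


def fetch_rdf(url, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_rdf_request(url), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request does not contact the server.
request = build_rdf_request(TERM_URL)
print(request.get_header("Accept"))
```

Calling fetch_rdf(TERM_URL) would perform the actual download. A browser visiting the same address sends a different Accept header and would be expected to receive a web page instead.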



All expression curation generated during this project has been deposited in the FlyBase CHADO database. FlyBase keeps this dataset in sync with the molecular data and genetic feature names in FlyBase. All curated data are visible on FlyBase, which also provides a publicly accessible SQL server for programmatic access.



Annotation of single neuron images in OWL can be downloaded from our SourceForge site (https://sourceforge.net/projects/virtualflybrain/). Combined with the anatomy ontology, this forms a queryable structure that allows more sophisticated queries than the VFB site.
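
The kind of query this combination supports can be sketched with a toy example in Python. The sketch below is not an OWL reasoner and does not read the downloadable files. It hand-codes a few part-of and overlap assertions and follows part-of links transitively, to show how an image annotated with a neuron class can be retrieved by asking about a larger enclosing region. All names are illustrative and are not real ontology identifiers.

```python
# Toy assertions; every name here is illustrative, not a real ontology term.
PART_OF = {              # region -> the region it is directly part of
    "region_X_lobe": "region_X",
    "region_X": "adult_brain",
    "region_Y": "adult_brain",
}
OVERLAPS = {             # neuron class -> regions it has some part in
    "example_neuron_class_1": {"region_X_lobe"},
    "example_neuron_class_2": {"region_Y"},
}
IMAGE_TYPE = {           # annotated image -> neuron class it depicts
    "image_001": "example_neuron_class_1",
    "image_002": "example_neuron_class_2",
    "image_003": "example_neuron_class_1",
}


def region_and_ancestors(region):
    """Return the region plus every region it is transitively part of."""
    found = {region}
    while region in PART_OF:
        region = PART_OF[region]
        found.add(region)
    return found


def images_overlapping(target_region):
    """Find images of neurons with some part in the target region or its parts."""
    hits = []
    for image, neuron_class in IMAGE_TYPE.items():
        regions = OVERLAPS.get(neuron_class, set())
        if any(target_region in region_and_ancestors(r) for r in regions):
            hits.append(image)
    return sorted(hits)


print(images_overlapping("region_X"))
print(images_overlapping("adult_brain"))
```

A query for region_X finds the images annotated with a class that overlaps only region_X_lobe, because the lobe is part of region_X. Standard reasoning software derives this kind of inference from the ontology's logical axioms rather than from hand-written traversal code.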



All code from the project is openly available on SourceForge (https://sourceforge.net/projects/virtualflybrain/).
Sectors Pharmaceuticals and Medical Biotechnology

URL http://www.virtualflybrain.org/
 
Description In 2018, our site reached 9500 users (up 10% on 2017) and 18,000 sessions (up 20% on 2017), a several-fold increase since the end of our BBSRC grant (data from Google Analytics). Approximately 50% of users are in the USA and 10% in the UK, with the remainder from 90 other countries.
First Year Of Impact 2012
 
Description Biomedical Resources Grant
Amount £996,004 (GBP)
Funding ID 208379/Z/17/Z 
Organisation Wellcome Trust 
Sector Charity/Non Profit
Country United Kingdom
Start 10/2017 
End 09/2021
 
Description Isaac Newton Trust Research Grant Scheme
Amount £19,265 (GBP)
Organisation University of Cambridge 
Department Isaac Newton Trust
Sector Academic/University
Country United Kingdom
Start 10/2012 
End 09/2013
 
Description Wellcome Trust Biomedical Resource Grant
Amount £463,000 (GBP)
Organisation Wellcome Trust 
Sector Charity/Non Profit
Country United Kingdom
Start 10/2014 
End 09/2017
 
Title VFB 1.5 
Description Virtual Fly Brain: a hub for Drosophila melanogaster neural anatomy and imaging data (new Release 1.5) 
Type Of Material Database/Collection of data 
Year Produced 2016 
Provided To Others? Yes  
Impact Page views per month have increased from ~7000 before release of this update (Jan 2016), to ~9500 (Feb 2017). 
URL http://www.virtualflybrain.org
 
Title Virtual Fly Brain 
Description Drosophila allows unprecedented genetic dissection of a relatively small nervous system that supports complex behaviours. Exploiting this opportunity requires intuitive tools to help users query the expanding knowledge from the literature and bulk image data, in order to generate hypotheses about neural circuits, and find reagents that target the neurons concerned. Virtual Fly Brain (VFB) is the only resource that provides this functionality, showing steadily increasing usage. It helps researchers to map fruitfly brain circuitry and dissect its function. It includes an encyclopedic catalogue of neurons, and tens of thousands of 3D images, all morphed on a template brain. This allows users to view brain cells from multiple experiments together as if part of the same brain. VFB acts as an integration hub for many sources of data, including 3D image data. It is the only global resource that does this. 
Type Of Material Database/Collection of data 
Year Produced 2011 
Provided To Others? Yes  
Impact The VFB website has become a hub for the research user community as we hoped. By the end of the grant in September 2012, usage per month had reached some 500 users, 700 sessions, and 2000 page views. By the final month before a major upgrade of the interface (January 2016), monthly usage had reached some 1100 users, 1300 sessions, and 7000 page views (Source: Google Analytics). Further development of the site has again been funded by Wellcome Trust since October 2014. 
URL http://www.virtualflybrain.org
 
Title Virtual Fly Brain 2.1 
Description A totally new user interface and tools for Virtual Fly Brain: a hub for fly (Drosophila melanogaster) neural anatomy, connectivity & imaging data. An interactive tool for neurobiologists to explore the detailed neuroanatomy, gene expression, and associated phenotypes of the adult Drosophila melanogaster brain. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
Impact 72 citations for the original paper describing the resource, Milyaev et al (2012) Bioinformatics, Volume 28, Issue 3, 1 February 2012, Pages 411-415, https://doi.org/10.1093/bioinformatics/btr677. A Google Scholar search returns 226 hits, which include scholarly works that link to the website but do not cite the paper.
URL https://v2.virtualflybrain.org/
 
Description VFB LMB 
Organisation Medical Research Council (MRC)
Department MRC Laboratory of Molecular Biology (LMB)
Country United Kingdom 
Sector Academic/University 
PI Contribution Access to VFB site for new data and analysis
Collaborator Contribution Provision of new analysis for VFB - principally automated classification of some 16000 single neuron images into around 1000 clusters of probably identical neurons, that were then integrated into the VFB website
Impact Addition of searchable links and images on 16000 neurons, integrated into an ontology, to the publicly accessible Virtual Fly Brain site (www.virtualflybrain.org)
Start Year 2012