Ondex

Lead Research Organisation: University of Manchester
Department Name: Computer Science

Abstract

See lead application Rothamsted

Technical Summary

The current ONDEX system enables data from diverse biological data sets to be linked, integrated and visualised through graph analysis techniques. It uses a semantically rich Core data structure based on graphs, has explicit support for workflows and can bring together information from structured databases and unstructured sources such as sequence data and free text. Extensions for Systems Biology include:

Enhancing the ONDEX Core:
- Methods to map data into the core data structures to exploit synteny and sequence similarity, for applications needing comparative analysis of the genetic and genomic organisation of multiple organisms.
- Techniques for probabilistic interpretation of relations, allowing uncertainty in the integrated data and in biological relationships to be modelled, and combining relations using probabilistic models such as naive Bayesian and Bayesian graphical Gaussian approaches.

Exploiting the ONDEX data graph:
- A graph structure analysis toolkit that uses standard and advanced graph analysis algorithms to traverse the data graph and allows modules representing common structural and functional components to be identified.

Populating the ONDEX model:
- Orchestrating data integration and analysis steps in ONDEX applications using Taverna workflows and services (myGrid), including the running of workflows. Using Taverna will allow ONDEX to retain data on workflow provenance, which can be used to track, verify and validate data.
- Enhanced text mining methods to extract and map terms from text in databases and online literature sources, to detect synonymy and ambiguity, and to identify and extract biologically relevant relations.

Exposing ONDEX to tools:
- New data access interfaces to allow ONDEX data to be used by third-party tools, e.g. within workflows, and data export tools to provide easy access to ONDEX data for users of Cytoscape and for export in standard systems biology model exchange formats (e.g. SBML, BioPAX).
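
The probabilistic combination of relations mentioned in the technical summary can be illustrated with a small sketch. This is not Ondex code: the function name `combine_evidence`, the prior and the likelihood ratios are all hypothetical, chosen only to show how a naive Bayesian model might merge independent evidence sources (for example a database record and a text-mining hit) into one confidence value for a relation.

```python
import math


def combine_evidence(prior, likelihood_ratios):
    """Return the posterior probability that a relation holds.

    Naive Bayes assumption: each evidence source is conditionally
    independent, so the likelihood ratios multiply (log-odds add).
    `prior` and `likelihood_ratios` are illustrative inputs, not
    values taken from any Ondex dataset.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        if ratio <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        # ratio = P(evidence | relation true) / P(evidence | relation false)
        log_odds += math.log(ratio)
    # Convert the combined log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Hypothetical relation with a 10% prior and two supporting sources.
    print(combine_evidence(0.1, [9.0, 2.0]))
```

Working in log-odds keeps the combination additive and numerically stable when many weak sources are merged. A real deployment would also have to address correlated sources, which the independence assumption ignores.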

Publications

 
Title Ondex database 
Description database 
Type Of Material Database/Collection of data 
Provided To Others? Yes  
Impact Throughout the project a wide range of Ondex integrated datasets were generated for the application cases being developed at Rothamsted, Manchester and Newcastle. At the end of the project a total of 68k Ondex native datasets had been generated and were available from the project website. This was an important resource for some users as they didn't necessarily need to start from scratch if their application case was related to one we had prepared earlier. 
URL http://www.ondex.org/index.shtml
 
Title Ondex suite 
Description Graph based data integration software 
Type Of Technology Software 
Open Source License? Yes  
Impact The Ondex suite underwent major software development during the project. This included the development of new integration tools, a user-friendly workflow designer, new visualisation methods, improvements to the user interface, simplifications to the installer software and the development of training and technical documentation. These improvements benefited users of Ondex, whether involved in the SABR project or outside it. At the end of the project approximately 900 unique users had downloaded Ondex at least once, including users at companies such as Syngenta and Unilever. 
URL http://www.ondex.org/index.shtml
 
Title OndexView 
Description Visualisation software for Ondex's data integration graph 
Type Of Technology Software 
Year Produced 2010 
Open Source License? Yes  
Impact The OndexView software (see selected publication 6) provides a user-friendly interface to enable Cytoscape users to access very large Ondex knowledge networks. The paper describes a novel approach to network data representation that uses semantic collapsing to raise the level of abstraction in networks and help human and computational users manage complexity. It was important to demonstrate that the main features of Ondex (data integration and visualisation) were separable and that alternative visualisation tools (such as Cytoscape) could be used. The additional feature of OndexView is that it enabled user-specified subsets of an Ondex network to be mapped and displayed in Cytoscape, which was essential for extremely large data integration projects where the network was too large to be accommodated in the Ondex visualisation interface. 
URL http://www.ondex.org/index.shtml
 
Title Taverna 
Description Workflow development and enactment software 
Type Of Technology Software 
Open Source License? Yes  
Impact The Taverna workflow system was extended to provide a comprehensive interface to Ondex data integration services. The Ondex suite was enabled with web services and tools for enacting Taverna workflows. Together, these two development activities provided bidirectional interoperability between Ondex and Taverna that was exploited extensively in the application cases developed at Manchester. 
URL http://www.taverna.org.uk/