Argudas - argumentation-based data sharing across gene expression databases

Lead Research Organisation: Heriot-Watt University
Department Name: School of Mathematical and Computer Sciences

Abstract

Modern biological and medical science produces large amounts of experimental data. Much of this data is held in public databases that researchers across the world can access via the Internet. The contents of these databases partially overlap. Because experimental details and conditions vary, and sometimes simply because of human error in handling the data, inconsistencies between these databases are now quite common. A researcher using these databases must therefore decide which data source to trust and work out which is right and which is wrong. In this project we propose to use so-called argumentation technology to deal with this problem. Just as people present arguments for and against a particular statement, and a discussion moves back and forth between them, computers can carry out similar reasoning if they are given the underlying knowledge. We therefore propose to develop a tool that uses data from such public databases, together with domain expert knowledge, to argue over the inconsistencies within and between these biomedical databases. The output of the tool will show biologists which statements are supported by which databases and how those statements relate to each other. This will allow a biologist to make an informed decision about which data he or she can have most confidence in. The proposed work focuses on a specific set of databases that hold gene expression data, i.e. data that captures which genes are active in which parts of an animal or human.

Technical Summary

Experimental variation, differences in the interpretation of results, and occasional human error during the execution of experiments and the handling of the data they produce mean that inconsistencies within and across biomedical databases are now common. There is currently no systematic, computational support to help end users deal with these inconsistencies. The primary objective of this project is to develop and deploy a web-based argumentation tool that allows end users to explore differences between a number of key gene expression databases, covering in situ and microarray gene expression and, possibly, SAGE data. Although the work focuses on gene expression, it will yield general lessons that prepare the ground for the possible introduction of argumentation-based data sharing in other biomedical application domains. Argumentation, a form of computer-based reasoning in which arguments for and against a particular statement are considered, has been shown to work in medical settings, where it can provide both decision support to medics and explanations of those decisions to patients. Argumentation has a clear theoretical basis, yet it follows a natural form of reasoning that is easily accessible and hence more readily accepted by scientists without training in formal logic and mathematics. Our previous work has shown that this form of reasoning is also suitable for addressing inconsistencies and incompleteness within and across biomedical data resources. While that work established much of the technical groundwork, the purpose of this project is to take argumentation technology from the laboratory of an informatics group to the biologist end user. To this end we place particular emphasis on evaluation with end users and focus on two use cases, at the MRC Human Genetics Unit and the European Bioinformatics Institute. The completed Argudas system will be hosted at the MRC after the project ends.
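To make the idea of weighing arguments for and against a statement concrete, the following is a minimal sketch of one textbook formalism, abstract argumentation with grounded semantics. It is an illustration only and is not taken from the Argudas implementation; the function name `grounded_extension`, the argument labels and the attack relation are all hypothetical.

```python
def grounded_extension(arguments, attacks):
    """Return the grounded (most sceptical) set of accepted arguments.

    arguments: iterable of hashable argument labels (hypothetical names).
    attacks:   iterable of (attacker, target) pairs.
    """
    arguments = list(arguments)
    attackers = {arg: set() for arg in arguments}
    for attacker, target in attacks:
        attackers[target].add(attacker)

    accepted, rejected = set(), set()
    changed = True
    while changed:
        changed = False
        for arg in arguments:
            if arg in accepted or arg in rejected:
                continue
            if attackers[arg] <= rejected:
                # Every attacker is already defeated, so the argument stands.
                accepted.add(arg)
                changed = True
            elif attackers[arg] & accepted:
                # An accepted argument attacks this one, so it is defeated.
                rejected.add(arg)
                changed = True
    return accepted


# Illustrative scenario (invented, not real data): two resources disagree
# about whether a gene is expressed, and an expert rule undermines one claim.
example_arguments = ["expressed", "not_expressed", "weak_assay"]
example_attacks = [
    ("expressed", "not_expressed"),
    ("not_expressed", "expressed"),
    ("weak_assay", "not_expressed"),
]
surviving = grounded_extension(example_arguments, example_attacks)
```

In this invented scenario the unattacked `weak_assay` argument defeats `not_expressed`, which in turn leaves `expressed` standing. Without the expert rule the two claims simply attack each other and the grounded set is empty, which corresponds to the situation where the user has to inspect the reasons and decide.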

Publications

Andrews S (2013) Gene Co-Expression in Mouse Embryo Tissues in International Journal of Intelligent Information Technologies

Baldock R (2012) Advances in Systems Biology

Hayamizu TF (2013) EMAP/EMAPA ontology of mouse developmental anatomy: 2013 update. in Journal of biomedical semantics

 
Description Argudas was one of the very first explorations of Argumentation Theory (the study of arguments and arguing) conducted in the biological rather than the medical domain. Argudas examined the use of argumentation to help an end user resolve inconsistencies and incompleteness within a range of gene expression databases for the mouse. In addition to generating arguments, Argudas explored the presentation of arguments and the appropriateness of applying the metaphor of 'arguing' to this use case.

This culminated in a series of research papers and an online tool that could be used to survey and explore the contents of the included resources. By asking about a single gene or anatomical structure, the user was able to view all the gene expression information from all four resources featured within Argudas. When the resources contained an inconsistency, the user was able to drill down to see a range of potential reasons for the disagreement. In this way Argudas helped the end user make an informed decision as to which resource was correct. Six years later this tool is no longer available due to a lack of resources for maintenance.
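The kind of cross-resource comparison described above can be sketched as follows. This is a hypothetical illustration, not the Argudas code: the record layout, the function name `find_conflicts`, and the resource, gene and structure labels are all invented placeholders.

```python
from collections import defaultdict


def find_conflicts(annotations):
    """Group expression claims by (gene, structure) and keep the disputed ones.

    annotations: iterable of (resource, gene, structure, expressed) tuples,
                 where `expressed` is True or False. The layout is assumed.
    Returns a dict mapping (gene, structure) to {resource: expressed}
    for every pair on which the resources disagree.
    """
    claims_by_pair = defaultdict(dict)
    for resource, gene, structure, expressed in annotations:
        claims_by_pair[(gene, structure)][resource] = expressed

    conflicts = {}
    for pair, claims in claims_by_pair.items():
        # More than one distinct value means at least two resources disagree.
        if len(set(claims.values())) > 1:
            conflicts[pair] = claims
    return conflicts


# Placeholder records, not real database content.
example_annotations = [
    ("resource_A", "gene_1", "structure_1", True),
    ("resource_B", "gene_1", "structure_1", False),
    ("resource_A", "gene_1", "structure_2", True),
    ("resource_B", "gene_1", "structure_2", True),
]
example_conflicts = find_conflicts(example_annotations)
```

Only the disputed pair is returned, together with each resource's claim, which is the starting point a user would need before drilling down into possible reasons for the disagreement.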

Argudas relied upon an external framework to generate arguments and tested a range of mechanisms for presenting those arguments to the user. It was clearly demonstrated that the ideal presentation was a matter of personal preference. However, the concept of communicating information to the biological end user via the metaphor of arguments was effective.
Exploitation Route Several of the aims and ideas behind Argudas fed into the EU FP7 project CUBIST (FP7-ICT-2009-5), which investigated the use of Semantic Technology in a relationship-focused Business Intelligence tool. Although CUBIST was not argument-centric, it shared several initial aims with Argudas. Like Argudas, CUBIST enabled the inference of possible new (i.e., missing) information and explored a range of techniques for visualising the information within the domain.
Sectors Agriculture, Food and Drink; Healthcare; Pharmaceuticals and Medical Biotechnology

 
Description EU FP7
Amount € 295,000 (EUR)
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 10/2010 
End 09/2013
 
Description EU FP7
Amount € 29,000 (EUR)
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 02/2010 
End 09/2011
 
Description MRC HGU Edinburgh Mouse Atlas Project 
Organisation Medical Research Council (MRC)
Department MRC Human Genetics Unit
Country United Kingdom 
Sector Academic/University 
PI Contribution Provided biomedical informatics expertise.
Collaborator Contribution Provided biomedical use case application.
Impact Publications and further grants (details submitted in other sections).
 
Title Argudas 
Description To identify and resolve inconsistencies between selected gene expression databases. 
Type Of Technology Webtool/Application 
Year Produced 2010 
Impact Led to further funding as described in other sections. 
URL http://www.macs.hw.ac.uk/bisel/argudas.shtml
 
Title mousePy 
Description Visual exploration of gene expression within the context of an anatomy ontology. 
Type Of Technology Webtool/Application 
Year Produced 2013 
Impact Prototype demonstrator. Not used outside project consortium yet. 
URL http://mousepy.info/static/