Logical Difference for Ontology Versioning

Lead Research Organisation: University of Liverpool
Department Name: Computer Science

Abstract

Ontologies are used to provide a common vocabulary for a domain of interest together with descriptions of the meaning of terms built from the vocabulary and relationships between them. A wide range of information technologies in areas such as medical informatics, bio-informatics, and the semantic web and grid are now dependent on ontologies to capture domain semantics and promote interoperability. A major example is the NHS Connecting for Health Programme with the strategic aim of adopting the healthcare ontology SNOMED CT across all NHS clinical systems.
Modern applications often require large and complex ontologies (sometimes containing more than 400,000 different term definitions), and, in practice, ontologies constantly evolve. They are regularly being extended, updated, corrected, and refined. Engineering, maintaining, and using such ontologies is a highly complex and laborious task, which is practically infeasible without appropriate tool support. Among the tools required, automated support for ontology versioning is one of the most critical and challenging.
The aim of this research proposal is to develop a novel approach to ontology versioning. Most current ontology editors and ontology management systems, such as Protege, SWOOP, OBO-Edit, and OntoView, support ontology versioning either natively or through plugins. Though helpful, it is generally agreed that current support is unsatisfactory: it lacks an unambiguous semantic foundation, it is syntax dependent, and it does not capture the non-local implications of differences between ontology versions. Moreover, it cannot take into account the fact that often one and the same ontology is used in different applications, which may require different comparison techniques. This research proposal aims at developing a logic-based approach to representing the difference between ontologies as a basis for ontology versioning.
Under this view, ontologies provide answers to queries about some vocabulary of interest with the help of reasoning services. If two versions of an ontology give the same answers to a class of queries relevant to an application domain, they may be deemed to have no difference regardless of their syntactic or structural form; and the queries that produce different answers from the two versions may be taken as the characterisation of the difference itself. The main advantage of the logical diff over known differencing techniques is that it is determined by the logical semantics of the ontology and query language and does not depend on their syntactic form. Moreover, this approach offers greater flexibility. By choosing an appropriate query language and vocabulary, one can detect exactly the differences visible when querying instance data, exactly the differences expressed by subsumptions between concepts, or even exactly the differences expressed in very expressive languages such as first-order logic.
To achieve these aims, a number of challenging theoretical and practical problems need to be solved. In particular:
- Novel algorithmic approaches are required to detect whether two versions of a logical theory are logically different.
- Novel techniques to succinctly characterise the logic-based difference between ontology versions are required.
- A mechanism tracing the differences in logical consequences back to axioms of the ontologies is required.
- Logical meta-properties of distinct notions of logical difference have to be understood.
Solutions to these research challenges are of great interest not only for ontology versioning but also for other areas of logic, knowledge representation, and automated reasoning such as ontology debugging, theory update, and theory modularisation.
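As a rough illustration of the logical difference described in the abstract, the following minimal sketch compares toy ontology versions that contain only atomic subsumptions of the form "A is subsumed by B". It is not the project's algorithm or tool: the function names, the example concept names, and the restriction to atomic subsumption queries are all assumptions made for illustration. The query language here is "subsumptions between concept names in a chosen signature", and the difference is the set of such queries entailed by the first version but not by the second.

```python
from itertools import product


def entailed_subsumptions(axioms, signature):
    """Return all pairs (a, b), a != b, over `signature` such that
    'a is subsumed by b' follows from the atomic axioms.

    `axioms` is a set of (sub, sup) pairs; entailment for this toy
    language is just the reflexive-transitive closure of those pairs.
    """
    names = {n for axiom in axioms for n in axiom} | set(signature)
    # reach[n] collects every concept name that n is subsumed by.
    reach = {n: {n} for n in names}
    changed = True
    while changed:  # naive fixpoint iteration, fine for a small example
        changed = False
        for sub, sup in axioms:
            for n in names:
                if sub in reach[n] and sup not in reach[n]:
                    reach[n].add(sup)
                    changed = True
    return {
        (a, b)
        for a, b in product(signature, repeat=2)
        if a != b and b in reach[a]
    }


def logical_diff(t1, t2, signature):
    """Subsumption queries over `signature` entailed by t1 but not by t2."""
    return entailed_subsumptions(t1, signature) - entailed_subsumptions(t2, signature)


# Hypothetical example versions (concept names are illustrative only).
V1 = {("Flu", "ViralInfection"), ("ViralInfection", "Infection")}
V2 = {("Flu", "Infection"), ("ViralInfection", "Infection")}
V3 = V1 | {("Flu", "Infection")}  # syntactically different, logically equal to V1
FULL = {"Flu", "ViralInfection", "Infection"}

print(sorted(logical_diff(V1, V2, FULL)))                # [('Flu', 'ViralInfection')]
print(sorted(logical_diff(V1, V3, FULL)))                # []
print(sorted(logical_diff(V1, V2, {"Flu", "Infection"})))  # []
```

In this sketch V1 and V3 differ as text but have an empty logical difference, while V1 and V2 differ only for applications whose signature includes ViralInfection. This mirrors the point that the difference depends on the chosen queries and vocabulary, not on syntactic form. Realistic description logics need a proper reasoner in place of the closure computation.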

Planned Impact

The main indirect and long-term beneficiaries from our investigation of logic-based ontology versioning are users and developers of information technologies that adopt ontologies. Of particular relevance are bio-informatics, medical informatics and healthcare, where ontologies are used in a wide range of applications to capture domain semantics and promote interoperability. A good example illustrating the great importance of ontologies in healthcare is SNOMED CT, the Systematized Nomenclature of Medicine Clinical Terms. SNOMED CT is a large-scale description logic ontology owned by the International Health Terminology Standards Development Organisation (IHTSDO), a not-for-profit organisation. Its importance in the United Kingdom is witnessed by the fact that it is one of the main strategic goals of the NHS Connecting for Health IT Programme to adopt SNOMED CT across all NHS clinical systems. Internationally, it has been adopted in a wide range of countries including the United States and Australia, and translations into German and Spanish have been released. SNOMED CT and the wide range of other large-scale ontologies used in areas such as healthcare are extremely difficult and costly to develop and maintain. Specifically, it is generally accepted that current versioning support is unsatisfactory and insufficient, and that without novel support tools the increasing size and complexity of ontologies will lead to them rapidly becoming unmaintainable. The results of this project on logic-based ontology versioning will potentially lead to ontologies of much higher quality that are significantly less costly to develop, maintain, and employ. The main direct beneficiaries from this research are developers of ontology management systems and editors and ontology developers.
Standard dissemination activities such as publication of journal papers, presentation of conference papers, and tutorials will ensure that developers of ontology management systems and reasoners as well as ontology developers benefit from our research. In addition, we will collaborate directly with a number of highly important ontology developers and ontology tool developers: (i) we will collaborate with the UK Terminology Centre (UKTC), which is charged with maintaining and developing SNOMED CT within the NHS Connecting for Health Programme. UKTC will help to test and evaluate the versioning algorithms developed in this research project. This collaboration will ensure that the UKTC can directly benefit from any research outcomes of this project. In this way we ensure that the NHS, and therefore the wider public in the United Kingdom, directly benefit from our research results. (ii) we will collaborate with Clark&Parsia, a company specialising in semantic technologies, to ensure that ontology management tools and reasoning systems developed by this company and their users benefit from our research. As a variety of software products developed by Clark&Parsia are open source (e.g., the well-known OWL reasoner Pellet), this will ensure dissemination among the wider community of developers of ontology development and reasoning tools. (iii) we will collaborate with the developers of HeTS, a widely used ontology management system developed at the DFKI Bremen and the University of Bremen. This will ensure that the versioning algorithms developed in our research project are available to the HeTS users and so disseminated among its developers and users.

Publications

Boris Konev (Author) (2011) Conjunctive Query Inseparability of OWL 2 QL TBoxes in Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence

Konev B (2012) The Logical Difference for the Lightweight Description Logic EL in Journal of Artificial Intelligence Research

William Gatens (2013) Module Extraction for Acyclic Ontologies in Proceedings of the 7th International Workshop on Modular Ontologies

 
Description Direct outcomes of this research project include
- A study of the complexity of query inseparability for lightweight description logic DL-Lite, practical algorithms for deciding query inseparability for this logic, and their application to ontology versioning and modularisation;

- A novel approach to computing logical difference for ELHr unfoldable TBoxes;

- A new software tool for computing logical difference.
Exploitation Route The logic-based approach to computing the difference between ontology versions is now a standard definition used in ontology research, specifically by groups in Manchester and Oxford.
Sectors Healthcare

 
Title CEX2.5 
Description CEX2.5 is a tool for computing three types of logical differences between two acyclic ELHr terminologies, i.e. EL terminologies with additional domain restrictions, range restrictions, and (simple) role inclusions. For a specified signature, each type of difference consists of the queries formulated over that signature whose answers are logically entailed by a given terminology T1 but not by a second terminology T2. The three types are differences w.r.t. concept inclusions, answers to instance queries, and answers to conjunctive queries. CEX2.5 uses the reasoner CB internally. The theoretical background behind CEX is described in the paper "The Logical Difference for the Lightweight Description Logic EL" by Boris Konev, Michel Ludwig, Dirk Walther and Frank Wolter.
Type Of Technology Software 
Year Produced 2012 
Open Source License? Yes  
Impact This tool led to two papers being produced 
URL http://lat.inf.tu-dresden.de/~michel/software/cex2/