iTract: Islands of Tractability in Ontology-Based Data Access

Lead Research Organisation: Birkbeck College
Department Name: Computer Science and Information Systems

Abstract

15 years ago most data was structured, complete, and neatly organised in databases. This is no longer the case. Unstructured, incomplete, and heterogeneous data sets are proliferating at an enormous rate. This is most evident in the context of the World Wide Web, but also applies to scientific data, data in business and industry, data in healthcare and in many other areas. To make use of such data, traditional information systems based on standard database technologies are no longer sufficient.

Ontology-based data access and management is a novel approach to address this challenge by introducing a semantic layer (ontology) that provides the user with a high-level unified view of the data as well as a vocabulary to access and query the data. Ontologies model application domains by providing machine readable definitions of terms and relationships between them. They are already used in numerous applications, for example, by the NHS: to enable communication between health professionals within the United Kingdom and worldwide, it is crucial that they use the same terminology; such a terminology is provided by the ontology SNOMED CT.

Using ontologies to access data and thereby directly combining data and knowledge is a novel idea of the 21st century. First applications have demonstrated that ontology-based data access and management is indeed feasible and has the potential to revolutionise modern information systems. However, the scalability of query answering with expressive ontology languages remains a major challenge, and the aim of this project is to develop a new "island of tractability" approach to tackle it. Our approach links ontology-based data access with two well-established and successful areas of Computer Science: constraint satisfaction and Boolean circuit complexity. We aim to transfer proof methods, techniques, and methodologies from these two areas to ontology-based data access. This includes a non-uniform complexity analysis, in which we aim to classify the complexity of answering ontology-mediated queries, each consisting of an ontology and a standard database query. Based on this complexity analysis, we will develop uniformly efficient query answering algorithms for the identified islands of tractable ontology-mediated queries, and implement them in the ontology-based data access systems Ontop and Combo. We will apply our novel technology to case studies from the oil and gas industry and healthcare.
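To make the notion of an ontology-mediated query concrete, the following is a minimal Python sketch of answering one by rewriting. It assumes a toy ontology that contains only inclusions between named classes, which is far weaker than OWL 2 QL, and it is not the algorithm used in Ontop or Combo. The function names, class names, and data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def rewrite_atomic_query(target, inclusions):
    """Return every class whose instances are entailed to be instances of `target`.

    `inclusions` is a list of (subclass, superclass) pairs. This toy ontology
    language is an assumption; it has no roles or existential restrictions.
    """
    subclasses_of = defaultdict(set)
    for sub, sup in inclusions:
        subclasses_of[sup].add(sub)

    rewriting = {target}
    stack = [target]
    while stack:
        current = stack.pop()
        for sub in subclasses_of[current]:
            if sub not in rewriting:
                rewriting.add(sub)
                stack.append(sub)
    return rewriting


def answer(target, inclusions, data):
    """Answer the query 'which individuals belong to `target`?'.

    The ontology is compiled into the query first; the rewritten query
    (a union of class-membership tests) is then evaluated over the raw data.
    """
    rewriting = rewrite_atomic_query(target, inclusions)
    return {individual for individual, cls in data if cls in rewriting}


# Hypothetical ontology and data.
ontology = [("Cardiologist", "Doctor"), ("Doctor", "HealthProfessional")]
facts = [("ann", "Cardiologist"), ("bob", "Patient"), ("eve", "Doctor")]

print(answer("HealthProfessional", ontology, facts))
```

In this sketch the data is never modified: all the knowledge in the ontology is folded into the query, so the final evaluation step is an ordinary database lookup.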

Publications

Artale A. (2015) First-order rewritability of temporal ontology-mediated queries in IJCAI International Joint Conference on Artificial Intelligence

Artale A. (2015) Tractable interval temporal propositional and description logics in Proceedings of the National Conference on Artificial Intelligence

Botoeva E (2016) Games for query inseparability of description logic knowledge bases in Artificial Intelligence

Botoeva E (2019) Query inseparability for ALC ontologies in Artificial Intelligence

Botoeva E. (2016) Query-based entailment and inseparability for ALC ontologies in IJCAI International Joint Conference on Artificial Intelligence

Brandt S (2018) Querying Log Data with Metric Temporal Logic in Journal of Artificial Intelligence Research

Bresolin D (2017) Horn Fragments of the Halpern-Shoham Interval Temporal Logic in ACM Transactions on Computational Logic

 
Description We have discovered and investigated many new and important islands of tractability in ontology-mediated query answering, including the following:

(1) We have considered the case of ontology-mediated querying with expressive data types such as the integers, the rational numbers, or related spatial and temporal data types. Using recent results on P/NP dichotomies for temporal constraint satisfaction problems, we obtained P/coNP dichotomies for ontology-mediated querying with datatypes. Moreover, in many cases, membership in the tractable class is decidable, and sometimes it can even be determined by a straightforward syntactic check. This work was published, for example, in AAAI 2017.

(2) We considered ontologies over the guarded fragment of first-order logic and determined very expressive fragments for which there exists a P/NP dichotomy for ontology-mediated query answering. In many practically relevant cases we obtained NExpTime or ExpTime decision procedures for deciding whether an ontology-mediated query is tractable, thus identifying important classes of queries for which PTime querying is possible. We also proved dichotomies between datalog-rewritability and coNP-hardness. This work received the Best Paper Award at PODS 2017.

We investigated the relationship between ontology-mediated query answering using unions of conjunctive queries and ontology-mediated query answering using SPARQL queries. We developed criteria and decision procedures for determining when the former can be reduced to the latter type of queries. This research is practically relevant as many implemented systems are based on SPARQL queries. This work received the Distinguished Paper Award at IJCAI 2018.

We investigated the question whether all tractable ontology-mediated queries can be rewritten into queries based on Horn ontologies, presenting both positive and negative results. We also gave decision procedures for containment and first-order rewritability of ontology-mediated queries over Horn ontologies. This work was presented at IJCAI 2016 and IJCAI 2018.

(3) We gave solutions to two fundamental computational problems in ontology-based data access with the W3C standard ontology language OWL 2 QL: the succinctness problem for first-order rewritings of ontology-mediated queries, and the complexity problem for ontology-mediated query answering. We classified ontology-mediated queries according to the shape of their conjunctive queries (treewidth, the number of leaves) and the existential depth of their ontologies. For each of these classes, we determined the combined complexity of ontology-mediated query answering, and whether all ontology-mediated queries in the class have polynomial-size first-order, positive existential, and nonrecursive datalog rewritings. We obtained the succinctness results using hypergraph programs, a new computational model for Boolean functions, which makes it possible to connect the size of ontology-mediated query rewritings and circuit complexity. This work was published in LICS 2014 and LICS 2015 and in the Journal of the ACM (2018).

We extended this analysis to ontology-mediated queries with sets of linear tgds and conjunctive queries of bounded hypertree width. We also investigated the parameterised complexity of answering tree-shaped ontology-mediated queries in OWL 2 QL under various restrictions on their ontologies and conjunctive queries. In particular, we constructed an ontology T such that answering ontology-mediated queries (T,q) with a tree-shaped query q is W[1]-hard if the number of leaves in q is regarded as the parameter. The number of leaves had previously been identified as an important characteristic of conjunctive queries, as bounding it leads to tractable ontology-mediated query answering. This work was presented at PODS 2017.
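As an illustration of the "number of leaves" parameter, the sketch below computes it for a tree-shaped conjunctive query given as a list of binary atoms. It assumes one common convention, namely that a leaf is a variable occurring in exactly one edge of the query's underlying undirected graph; the representation of atoms as (predicate, variable, variable) triples and the function name are hypothetical.

```python
from collections import Counter


def count_leaves(atoms):
    """Count the leaves of a tree-shaped conjunctive query.

    `atoms` is a list of (predicate, var1, var2) triples. Parallel atoms over
    the same pair of variables are collapsed into a single undirected edge.
    The function does not verify that the query is actually tree-shaped.
    """
    edges = {frozenset((u, v)) for _, u, v in atoms if u != v}
    degree = Counter()
    for edge in edges:
        for variable in edge:
            degree[variable] += 1
    return sum(1 for d in degree.values() if d == 1)


# A path-shaped (linear) query: R(x, y), S(y, z).
linear_query = [("R", "x", "y"), ("S", "y", "z")]

# A star-shaped query with centre c and three branches.
star_query = [("R", "c", "a"), ("R", "c", "b"), ("S", "c", "d")]

print(count_leaves(linear_query), count_leaves(star_query))
```

Under this convention a linear query always has two leaves regardless of its length, whereas a star-shaped query has as many leaves as branches.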

(4) We have investigated islands of tractability in temporal ontology-based data access with the linear temporal logic LTL, the Halpern-Shoham interval temporal logic HS, and the metric temporal logic MTL. This work was published in JAIR 2018 and ACM TOCL 2017 and presented at IJCAI 2016, AAAI 2017, and TIME 2017.
Exploitation Route We expect that our findings will be used in ontology-based query answering settings in both academia and industry.
Sectors Digital/Communication/Information Technologies (including Software),Education,Energy,Healthcare,Culture, Heritage, Museums and Collections,Other

 
Description Our query rewriting algorithms for OWL 2 QL have been implemented in the open-source ontology-based data access platform Ontop (https://ontop.inf.unibz.it). We have collaborated with the Free University of Bozen-Bolzano on the development of the Ontop platform for the past 7 years (ISWC 2013, ISWC 2014, Semantic Web 2017, ISWC 2018). Ontop is the query transformation engine for the EU Optique project and has been extensively evaluated at the industrial partners, Equinor and Siemens. SIRIS Academic uses the Ontop engine for its data integration tool. The Ontop platform is also at the core of the BT Hypercat Data Hub and is used in the DALI project at IBM Ireland. The Ontop plugin for the de facto standard ontology editor Protégé has been downloaded more than 13K times since 2015.
First Year Of Impact 2017
Sector Digital/Communication/Information Technologies (including Software),Education,Energy,Other
Impact Types Economic

 
Description University of Oslo 
Organisation University of Oslo
Country Norway 
Sector Academic/University 
PI Contribution Developing the ontology-based data access system Ontop
Collaborator Contribution Developing the ontology-based data access system Ontop
Impact Developing the ontology-based data access system Ontop
Start Year 2015