ExODA: Integrating Description Logics and Database Technologies for Expressive Ontology-Based Data Access

Lead Research Organisation: Birkbeck College
Department Name: Computer Science and Information Systems

Abstract

Sources of semi-structured, overlapping, and semantically related data on the Web are currently proliferating at a phenomenal rate, which has created a demand for more powerful and flexible information systems (ISs). This new generation of ISs will need to integrate incomplete and semi-structured information from heterogeneous sources, employ rich and flexible schemas, and answer queries by taking into account both knowledge and data.

Ontology-based data access has recently been proposed as an architectural principle for such systems. The main idea is to develop a unified view of the data by describing the relevant domain in an ontology, which then provides the vocabulary used to ask queries. The IS can use ontological statements, such as the concept hierarchy, to derive new facts and thus enrich query answers with implicit knowledge. This idea has been incorporated into systems such as QuOnto, Owlgres, ROWLKit, and REQUIEM, and ontology reasoners such as RACER, FaCT++, Pellet, and HermiT.

Such systems suffer from two main problems. First, the modelling capabilities of ontology languages are often insufficient for practical use cases. In order to achieve favourable computational properties, ontology languages are usually capable of describing only tree-shaped relationships; furthermore, with some notable exceptions, they usually support only unary and binary predicates. Finally, ontology languages typically employ the open world assumption; however, when answering queries over large amounts of data, the closed world assumption (CWA) is often more appropriate.

Second, query answering facilities in existing ontology-based ISs typically do not scale to data sets commonly encountered in practice. Up to now, approaches to addressing this problem have focused on reducing the expressivity of the ontology language even further in order to obtain formal tractability guarantees. This exacerbates the first problem (restricted modelling capabilities), while not necessarily delivering robust scalability in practice.

Database theory and practice can provide partial solutions to these problems. In databases, complex domains can be described using dependencies. Dependencies are used in a number of different ways: they are often used as integrity constraints, that is, checks that verify whether a database instance includes all data specified in the domain description; however, dependencies can also be used similarly to ontologies to derive implicit knowledge. Treating dependencies as integrity constraints and answering queries under CWA has allowed practical relational database management systems (RDBMSs) to scale to very large data sets.

Database techniques alone do not, however, satisfy all the requirements for an ontology-based IS. In particular, dependencies often cannot model arbitrarily large structures and thus do not cover all practical modelling use cases. Furthermore, generalising the query answering techniques used in practical RDBMSs to the case where information-deriving dependencies must be taken into account is still an open problem.

We therefore believe that the next generation of ontology-based ISs should be based on a synthesis and an extension of ontology and database systems and techniques, providing data handling capabilities similar to current RDBMSs, but with schemas that are rich, flexible, and tightly integrated with the data. In order to achieve this ambitious goal, however, a number of challenging fundamental problems must be solved. First, ontology and dependency languages need to be unified in a coherent theoretical framework. Second, it will be necessary to identify fragments of the framework that are likely to exhibit robust scalability but can still support realistic use cases. Third, it will be necessary to devise effective algorithmic techniques that can form the basis of practical ISs.
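
As a rough illustration of two ideas from the abstract, the Python sketch below shows (a) how a concept hierarchy can enrich query answers with implicit facts, and (b) how the same statement can instead be read as an integrity constraint that only checks the explicit data. The concept names, individuals, and function names are all hypothetical and chosen for illustration; this is a toy under those assumptions, not the method of any of the systems named above.

```python
# Hypothetical toy ontology: each concept maps to its direct superconcepts.
SUBCLASS_OF = {
    "Professor": {"Lecturer"},
    "Lecturer": {"Employee"},
    "Employee": {"Person"},
}

# Hypothetical data: explicit (individual, concept) facts.
FACTS = {("ann", "Professor"), ("bob", "Employee"), ("eve", "Person")}


def superconcepts(concept):
    """Return the concept together with everything above it in the hierarchy."""
    seen, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def answer_plain(concept, facts):
    """Look up the explicit facts only, ignoring the ontology."""
    return {ind for ind, c in facts if c == concept}


def answer_with_ontology(concept, facts):
    """Also return individuals whose asserted concept is subsumed by the query concept."""
    return {ind for ind, c in facts if concept in superconcepts(c)}


def violations(sub, sup, facts):
    """Read 'every sub is a sup' as an integrity constraint: report individuals
    asserted to be in sub that have no explicit sup fact in the data."""
    return {ind for ind, c in facts if c == sub and (ind, sup) not in facts}


if __name__ == "__main__":
    print(sorted(answer_plain("Employee", FACTS)))
    print(sorted(answer_with_ontology("Employee", FACTS)))
    print(sorted(violations("Professor", "Lecturer", FACTS)))
```

In this sketch, the ontology-aware query returns `ann` as an employee even though no such fact is stored, whereas the integrity-constraint reading reports `ann` as a violation because the explicit `Lecturer` fact is missing from the data.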

Publications

Artale A (2010) Past and Future of DL-Lite in Twenty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2010, Atlanta, Georgia, USA, July 11-15, 2010

Artale A (2014) A Cookbook for Temporal Conceptual Data Modelling with Description Logics in ACM Transactions on Computational Logic

Artale A (2013) Temporal Description Logic for Ontology-Based Data Access in 23rd International Joint Conference on Artificial Intelligence, IJCAI 2013, Beijing, China, August 3-9, 2013

Gottlob G (2014) The price of query rewriting in ontology-based data access in Artificial Intelligence

Kikot S (2012) Conjunctive Query Answering with OWL 2 QL in Thirteenth International Conference on Principles of Knowledge Representation and Reasoning, KR 2012, Rome, Italy, June 10-14, 2012

Konev B (2011) Conjunctive Query Inseparability of OWL 2 QL TBoxes in Twenty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2011, San Francisco, California, USA, August 7-11, 2011

 
Description: First, we discovered that ontology-based data access (OBDA) with the W3C standard OWL 2 QL can be prohibitively expensive in the worst case. On the other hand, we demonstrated experimentally that, in real-world practice, OBDA is very efficient.
Exploitation Route: The OBDA system Ontop is open access.
Sectors: Digital/Communication/Information Technologies (including Software); Healthcare; Culture, Heritage, Museums and Collections

URL: http://ontop.inf.unibz.it