quantMD: Ontology-Based Management for Many-Dimensional Quantitative Data

Lead Research Organisation: Birkbeck College
Department Name: Computer Science and Information Systems

Abstract

Ontology-based data management (OBDM) is a technology that has been developed over the past decade with the aim of facilitating access to various types of data sources. In general, ontologies provide a formal model and vocabulary for a domain of interest. In OBDM, the role of ontologies is threefold: to integrate distributed and heterogeneous data sources, enrich incomplete data with background knowledge, and provide a user-friendly language for querying.

To illustrate, geologists in an energy company traditionally meet their information needs in one of two ways. Either they execute pre-defined queries that cover parts of those needs over their databases and then integrate the results manually, which is onerous and error-prone, or they ask the IT department to construct custom SQL queries, which may take days or even weeks. OBDM reduces the time for finding answers to minutes by allowing the geologists to formulate their queries in natural-language terms and then run these queries via the OBDM tools over their databases.

Thus, by bringing together knowledge representation and database technologies, OBDM has the potential to transform information systems by allowing domain experts to query complex and distributed data efficiently without the help of database professionals.

This project addresses the main bottleneck in realising this potential: so far, OBDM has been developed primarily for access to purely qualitative and one-dimensional data, whereas data nowadays is mostly numerical, many-dimensional, and often temporal, and user information needs usually involve quantitative analysis. Thus, quantitative queries such as "find all UK-sponsored research institutions in Europe whose total triennial financial contributions from UK-based private companies exceed euro 10M" are not supported at all by existing OBDM tools. Moreover, because of the so-called open world assumption made in OBDM, developing the theory and practical tools for dealing with such queries is extremely challenging.

The aim of this project is to develop a novel OBDM framework for querying and analysing many-dimensional numerical data. To address the challenges, we bring together techniques from databases, knowledge representation, and formal methods, in particular temporal and modal logics, and develop these further. We will develop a theoretical framework for querying such data, develop tools for using this framework in practice, and test our tools with partners from industry and the public sector.
 
Description We have made significant progress towards our goal of extending standard static ontology-based data access to data with a temporal dimension. We started by developing a framework for querying one-dimensional temporal data that can represent the temporal evolution of a single object, taking into account background knowledge formulated in an ontology. We proposed to use ontologies given in linear temporal logic, LTL, which was invented in philosophy and has been successfully applied in computer science in the area of program verification. Queries are likewise given in LTL, namely in its positive fragment. Within this framework, we investigated the complexity of ontology-mediated queries and their rewritability to standard relational queries. By taking account of the expressivity of the temporal operators used in the ontology and the shape of the queries, we identified a hierarchy of more and more powerful ontology-mediated queries and proved rewritability into standard database queries, into such queries extended by standard arithmetic predicates, or into further extensions with primitive recursion.
We have thus laid the foundation for practical ontology-based access to one-dimensional temporal data.
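
As a rough illustration of what rewritability means here, the following Python sketch compares two ways of answering a query over one-dimensional temporal data. It is a toy example under stated assumptions, not the project's algorithm: the axiom "whenever A holds, B holds at the next time point", the atom names A and B, and both function names are hypothetical, and the timeline is restricted to the finite window spanned by the data. The first function applies the axiom to the data and then reads off the answers; the second evaluates a rewritten query directly on the original data, without touching the ontology.

```python
from typing import Dict, Set

# A temporal data instance: timestamp -> set of atoms recorded at that time.
Data = Dict[int, Set[str]]


def answers_by_applying_axiom(data: Data) -> Set[int]:
    """Answer the query B by first applying the hypothetical axiom
    'A implies next-time B' to the data, then reading off B."""
    lo, hi = min(data), max(data)
    facts = {t: set(data.get(t, set())) for t in range(lo, hi + 1)}
    # One forward pass suffices: the axiom only derives B, and B triggers nothing.
    for t in range(lo, hi):
        if "A" in facts[t]:
            facts[t + 1].add("B")
    return {t for t in range(lo, hi + 1) if "B" in facts[t]}


def answers_by_rewriting(data: Data) -> Set[int]:
    """Answer the same query with a rewriting evaluated on the raw data:
    B holds at t if B is recorded at t or A is recorded at t - 1."""
    lo, hi = min(data), max(data)
    return {
        t
        for t in range(lo, hi + 1)
        if "B" in data.get(t, set()) or "A" in data.get(t - 1, set())
    }


example: Data = {0: {"A"}, 1: set(), 2: {"B"}, 3: {"A"}, 4: set()}
print(sorted(answers_by_rewriting(example)))  # prints [1, 2, 4]
```

Both functions return the same timestamps on this instance. The point of a rewriting is that the second form can be handed to an ordinary database engine, since it refers only to the stored data.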

In a second step, we extended our framework for querying one-dimensional temporal data to querying two-dimensional data in which each timestamp comes with a database of facts true at that timestamp. We proposed to model the second dimension using the description logic underpinning the OWL profile for static one-dimensional data access. Within this framework, we proved powerful transfer results that lift our complexity and rewritability results from the one-dimensional to the two-dimensional case. Using these transfer results, we again obtained a hierarchy of more and more powerful two-dimensional ontology-mediated queries, which combine fragments of LTL with description logic.
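
The shape of such two-dimensional data can be sketched in Python as a mapping from timestamps to small databases of facts. Everything specific in this sketch is an assumption made for illustration: the concept names, the single inclusion "every Professor is Staff", the individuals, and the function name are hypothetical, and the query "x was Staff at some strictly earlier timestamp" is evaluated only over the timestamps present in the data.

```python
from typing import Dict, Set, Tuple

Fact = Tuple[str, str]              # (concept name, individual)
TemporalDB = Dict[int, Set[Fact]]   # timestamp -> database of facts at that time

# Hypothetical static background knowledge: Professor is a subconcept of Staff.
IMPLIES_STAFF = {"Staff", "Professor"}


def staff_sometime_before(db: TemporalDB) -> Set[Tuple[str, int]]:
    """Return pairs (x, t) such that x is Staff, directly or via the
    inclusion above, at some timestamp strictly earlier than t."""
    answers: Set[Tuple[str, int]] = set()
    seen: Set[str] = set()  # individuals known to be Staff at an earlier time
    for t in sorted(db):
        answers.update((x, t) for x in seen)
        seen.update(x for (concept, x) in db[t] if concept in IMPLIES_STAFF)
    return answers


example_db: TemporalDB = {
    2019: {("Professor", "ann")},
    2020: {("Student", "bob")},
    2021: {("Staff", "bob")},
}
print(sorted(staff_sometime_before(example_db)))
# prints [('ann', 2020), ('ann', 2021)]
```

Here the static inclusion is applied independently at each timestamp, while the temporal operator ranges over the timeline, which mirrors the separation between the description-logic dimension and the temporal dimension.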
Exploitation Route Our findings can be used in virtual knowledge graph systems.
Sectors Digital/Communication/Information Technologies (including Software)

 
Description University of Oslo 
Organisation University of Oslo
Country Norway 
Sector Academic/University 
PI Contribution Developing the ontology-based data access system Ontop
Collaborator Contribution Developing the ontology-based data access system Ontop
Impact Developing the ontology-based data access system Ontop
Start Year 2015