XML with Incomplete Information: Representation, Querying, and Applications

Lead Research Organisation: University of Edinburgh

Department Name: Sch of Informatics

Abstract

Data on the Web - particularly XML data - is often incomplete and inconsistent, due to such factors as the lack of centralisation and control over data quality. While the transfer and extension of relational tools to deal with XML data has been a central theme in data management research over the past decade, the standard database toolbox offers us little in terms of handling of incompleteness. Indeed, it is one of the most notoriously underdeveloped and most often criticised aspects of relational databases. In addition, the flexibility of XML leads to many ways in which incompleteness of data can be accommodated, in addition to the standard relational null values. There has not yet been any detailed study of incompleteness in XML. Our main goal is to conduct such a systematic study, and develop its applications in the area that underlies data management tasks on the Web -- the use of data across multiple independent applications. We shall investigate models of XML with incomplete information and algorithmic techniques for querying such data, paying particular attention to the correctness/complexity tradeoffs and to the practicality of algorithmic tools. We shall investigate the fundamental role of incompleteness in applications that involve the movement of XML data, such as integration of data from various sources or moving data between peers according to mappings between their schemas. We shall develop a specification and algorithmic toolbox for dealing with incomplete information as it arises in such applications.

Funded Value:

£565,505

Funded Period:

Sep 09 - Nov 13

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/G049165/1

Principal Investigator:

Leonid Libkin

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Fundamentals of Computing (50%)

Information & Knowledge Mgmt (50%)

Organisations

University of Edinburgh (Lead Research Organisation)

People	ORCID iD
Leonid Libkin (Principal Investigator)

Publications

Libkin L (2012) Logic for Programming, Artificial Intelligence, and Reasoning

Kostylev E (2015) Complexity of answering counting aggregate queries over DL-Lite in Journal of Web Semantics

Kolahi S (2008) An information-theoretic analysis of worst-case redundancy in database design in ACM Transactions on Database Systems

Hernich A (2011) Closed world data exchange in ACM Transactions on Database Systems

Gheerbrant A (2012) On the complexity of query answering over incomplete XML documents

Gheerbrant A (2014) Certain Answers over Incomplete XML Documents: Extending Tractability Boundary in Theory of Computing Systems

Gheerbrant A (2013) In Search of Elegance in the Theory and Practice of Computation

Gheerbrant A (2012) Complete Axiomatizations of Fragments of Monadic Second-Order Logic on Finite Trees in Logical Methods in Computer Science

Gheerbrant A (2014) Naïve Evaluation of Queries over Incomplete Databases in ACM Transactions on Database Systems

Gheerbrant A (2013) When is naive evaluation possible?

Key Findings

Description	Data on the Web - particularly XML data - is often incomplete and inconsistent, due to such factors as the lack of centralisation and control over data quality. The flexibility of XML leads to many ways in which incompleteness of data can be accommodated, in addition to the standard relational null values. We have provided a detailed study of incompleteness in XML, and developed its applications in the area that underlies data management tasks on the Web -- the use of data across multiple independent applications. We classified models of XML with incomplete information and algorithmic techniques for querying such data, paying particular attention to the correctness/complexity tradeoffs and the practicality of algorithmic tools. We demonstrated the fundamental role of incompleteness in applications that involve the movement of XML data, such as integration of data from various sources or moving data between peers according to mappings between their schemas. We developed a specification and algorithmic toolbox for dealing with incomplete information as it arises in such applications.
Exploitation Route	Processing XML data with incomplete and imprecise information
Sectors	Digital/Communication/Information Technologies (including Software)

Impact Summary

Description	The key impacts are twofold: understanding the role of incompleteness in data exchange systems, and providing models of incompleteness in complex data models. The former had an impact on the design of data exchange systems, the latter on the development of incompleteness models for complex structures used in today's data management tasks (graph data, RDF).
Sector	Digital/Communication/Information Technologies (including Software)
