XML with Incomplete Information: Representation, Querying, and Applications
Lead Research Organisation:
University of Edinburgh
Department Name: Sch of Informatics
Abstract
Data on the Web - particularly XML data - is often incomplete and inconsistent, due to such factors as the lack of centralisation and control over data quality. While the transfer and extension of relational tools to deal with XML data has been a central theme in data management research over the past decade, the standard database toolbox offers us little in terms of handling of incompleteness. Indeed, it is one of the most notoriously underdeveloped and most often criticised aspects of relational databases. In addition, the flexibility of XML leads to many ways in which incompleteness of data can be accommodated, in addition to the standard relational null values.
There has not yet been any detailed study of incompleteness in XML. Our main goal is to conduct such a systematic study, and develop its applications in the area that underlies data management tasks on the Web -- the use of data across multiple independent applications.
We shall investigate models of XML with incomplete information and algorithmic techniques for querying such data, paying particular attention to the correctness/complexity tradeoffs and to the practicality of algorithmic tools. We shall investigate the fundamental role of incompleteness in applications that involve the movement of XML data, such as integration of data from various sources or moving data between peers according to mappings between their schemas. We shall develop a specification and algorithmic toolbox for dealing with incomplete information as it arises in such applications.
Organisations
People | Leonid Libkin (Principal Investigator) |
Publications
Libkin L
(2012)
Logic for Programming, Artificial Intelligence, and Reasoning
Kostylev E
(2015)
Complexity of answering counting aggregate queries over DL-Lite
in Journal of Web Semantics
Kolahi S
(2008)
An information-theoretic analysis of worst-case redundancy in database design
in ACM Transactions on Database Systems
Hernich A
(2011)
Closed world data exchange
in ACM Transactions on Database Systems
Gheerbrant A
(2012)
On the complexity of query answering over incomplete XML documents
Gheerbrant A
(2014)
Certain Answers over Incomplete XML Documents: Extending Tractability Boundary
in Theory of Computing Systems
Gheerbrant A
(2013)
In Search of Elegance in the Theory and Practice of Computation
Gheerbrant A
(2012)
Complete Axiomatizations of Fragments of Monadic Second-Order Logic on Finite Trees
in Logical Methods in Computer Science
Gheerbrant A
(2014)
Naïve Evaluation of Queries over Incomplete Databases
in ACM Transactions on Database Systems
Gheerbrant A
(2013)
When is naive evaluation possible?
Description | Data on the Web - particularly XML data - is often incomplete and inconsistent, due to such factors as the lack of centralisation and control over data quality. The flexibility of XML leads to many ways in which incompleteness of data can be accommodated, in addition to the standard relational null values. We have provided a detailed study of incompleteness in XML, and developed its applications in the area that underlies data management tasks on the Web -- the use of data across multiple independent applications. We classified models of XML with incomplete information and algorithmic techniques for querying such data, paying particular attention to the correctness/complexity tradeoffs and the practicality of algorithmic tools. We demonstrated the fundamental role of incompleteness in applications that involve the movement of XML data, such as integration of data from various sources or moving data between peers according to mappings between their schemas. We developed a specification and algorithmic toolbox for dealing with incomplete information as it arises in such applications. |
Exploitation Route | Processing XML data with incomplete and imprecise information |
Sectors | Digital/Communication/Information Technologies (including Software) |
Description | The key impacts are twofold: understanding the role of incompleteness in data exchange systems, and providing models of incompleteness in complex data models. The former had an impact on the design of data exchange systems, the latter on the development of incompleteness models for the complex structures used in today's data management tasks (graph data, RDF). |
Sector | Digital/Communication/Information Technologies (including Software) |