XML with Incomplete Information: Representation, Querying, and Applications
Lead Research Organisation:
University of Edinburgh
Department Name: Sch of Informatics
Abstract
Data on the Web - particularly XML data - is often incomplete and inconsistent, due to such factors as the lack of centralisation and control over data quality. While the transfer and extension of relational tools to deal with XML data has been a central theme in data management research over the past decade, the standard database toolbox offers us little in terms of handling of incompleteness. Indeed, it is one of the most notoriously underdeveloped and most often criticised aspects of relational databases. In addition, the flexibility of XML leads to many ways in which incompleteness of data can be accommodated, in addition to the standard relational null values. There has not yet been any detailed study of incompleteness in XML. Our main goal is to conduct such a systematic study, and develop its applications in the area that underlies data management tasks on the Web -- the use of data across multiple independent applications. We shall investigate models of XML with incomplete information and algorithmic techniques for querying such data, paying particular attention to the correctness/complexity tradeoffs and to the practicality of algorithmic tools. We shall investigate the fundamental role of incompleteness in applications that involve the movement of XML data, such as integration of data from various sources or moving data between peers according to mappings between their schemas. We shall develop a specification and algorithmic toolbox for dealing with incomplete information as it arises in such applications.
Organisations
People | ORCID iD |
Leonid Libkin (Principal Investigator) |
Publications
Amano S
(2014)
XML Schema Mappings: Data Exchange and Metadata Management
in Journal of the ACM
Amano S
(2009)
XML schema mappings
Arenas M
(2010)
Relational and XML Data Exchange
in Synthesis Lectures on Data Management
Arenas M
(2010)
Regular Languages of Nested Words: Fixed Points, Automata, and Synchronization
in Theory of Computing Systems
Arenas M
(2013)
Solutions and query rewriting in data exchange
in Information and Computation
Atkey R
(2010)
Programming Languages and Systems
Barceló P
(2013)
Graph Logics with Rational Relations
in Logical Methods in Computer Science
Barceló P
(2012)
On Low Treewidth Approximations of Conjunctive Queries
in Proceedings of the 6th Alberto Mendelzon International Workshop on Foundations of Data Management
Barceló P
(2011)
Querying graph patterns
Description | Data on the Web - particularly XML data - is often incomplete and inconsistent, due to such factors as the lack of centralisation and control over data quality. The flexibility of XML leads to many ways in which incompleteness of data can be accommodated, in addition to the standard relational null values. We have carried out a detailed study of incompleteness in XML and developed its applications in the area that underlies data management tasks on the Web -- the use of data across multiple independent applications. We classified models of XML with incomplete information and algorithmic techniques for querying such data, paying particular attention to the correctness/complexity tradeoffs and the practicality of algorithmic tools. We demonstrated the fundamental role of incompleteness in applications that involve the movement of XML data, such as integration of data from various sources or moving data between peers according to mappings between their schemas. We developed a specification and algorithmic toolbox for dealing with incomplete information as it arises in such applications. |
Exploitation Route | Processing XML data with incomplete and imprecise information |
Sectors | Digital/Communication/Information Technologies (including Software) |
Description | The key impacts are twofold: understanding the role of incompleteness in data exchange systems, and providing models of incompleteness in complex data models. The former has had an impact on the design of data exchange systems, the latter on the development of incompleteness models for the complex structures used in today's data management tasks (graph data, RDF). |
Sector | Digital/Communication/Information Technologies (including Software) |