Relational and XML Data Exchange: Semantics, Consistency, and Query Answering
Lead Research Organisation:
University of Edinburgh
Department Name: Sch of Informatics
Abstract
Data exchange is one of the oldest data management problems. It arises every time two or more legacy databases need to exchange information while their schemas cannot be changed. The main technical challenges are in building target database instances that correctly represent information from the source data, and in evaluating queries on such databases in a semantically correct manner. The foundational aspects of data exchange had not been studied until very recently. Over the past few years, commercial products have appeared which help one manage e-business applications that communicate data yet remain autonomous. Such systems, however, use ad hoc query answering techniques, which motivates much of the research on foundations of data exchange. The goal of this project is to contribute towards the development of solid foundations for data exchange, concentrating on such critical issues as managing inherent incompleteness of information in data exchange, using it in query answering, and extending techniques from relational databases to the exchange of data represented as XML documents.
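The two challenges named in the abstract, building a target instance and answering queries over it, can be illustrated with a minimal relational sketch. Everything in it is an assumption made for illustration and is not taken from the project: the source relation Emp(name, dept), the target relations WorksIn(name, dept) and Manages(dept, mgr), the mapping rule "every Emp(n, d) yields WorksIn(n, d) and Manages(d, m) for some unknown m", and the function names. Unknown values are represented by labeled nulls, and certain answers to a conjunctive query are computed by naive evaluation, that is, by treating nulls as ordinary values during the join and then discarding any answer that still contains a null.

```python
from itertools import count


class Null:
    """A labeled null: a placeholder for a value the source does not supply.

    Instances compare by identity, so two different nulls never join.
    """

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"N{self.label}"


def exchange(emp):
    """Build a target instance from the hypothetical source relation Emp.

    Assumed mapping: Emp(n, d) -> exists m. WorksIn(n, d) and Manages(d, m).
    Each source tuple gets its own fresh null for the unknown manager.
    """
    fresh = count(1)
    works_in, manages = set(), set()
    for name, dept in emp:
        works_in.add((name, dept))
        manages.add((dept, Null(next(fresh))))
    return works_in, manages


def certain_answers(works_in, manages, project):
    """Naive evaluation of the join WorksIn(n, d), Manages(d, m).

    `project` picks the output columns. Answers that still contain a
    null are dropped, since their value is not known for certain.
    """
    answers = set()
    for name, dept in works_in:
        for dept2, mgr in manages:
            if dept == dept2:
                row = project(name, dept, mgr)
                if not any(isinstance(value, Null) for value in row):
                    answers.add(row)
    return answers


# Hypothetical source data.
source = [("alice", "db"), ("bob", "db"), ("carol", "logic")]
works_in, manages = exchange(source)

# "Who works in a department that has some manager?"
has_manager = certain_answers(works_in, manages, lambda n, d, m: (n,))

# "Who is managed by whom?"
managed_by = certain_answers(works_in, manages, lambda n, d, m: (n, m))
```

In this sketch the first query returns all three employees, because the existence of a manager is guaranteed even though the manager's identity is not. The second query returns nothing, because every candidate answer contains a null. This approach is only appropriate for positive queries such as the join above; queries with negation or aggregation are where the choice of semantics for incomplete information starts to matter.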
Organisations
People | Leonid Libkin (Principal Investigator) |
Publications
Alur R (2008) First-Order and Temporal Logics for Nested Words in Logical Methods in Computer Science
Arenas M (2008) XML data exchange: Consistency and query answering in Journal of the ACM
Arenas M (2008) Game-based notions of locality over finite models in Annals of Pure and Applied Logic
Arenas M (2008) On the Complexity of Verifying Consistency of XML Specifications in SIAM Journal on Computing
Barceló P (2010) Expressive languages for path queries over graph-structured data
Barceló P (2009) XML with incomplete information
Benedikt M (2007) Logical definability and query languages over ranked and unranked trees in ACM Transactions on Computational Logic
David C (2010) Certain answers for XML queries
Description | The key findings fall into two categories. The first addresses the issue of semantics in data exchange. The second concerns the development of a complete toolkit for XML data exchange. Regarding the first group of results, prior to our work everyone used a single model of data exchange, while admitting its obvious shortcomings. We explained that these shortcomings come from the mishandling of incomplete information that naturally occurs in databases arising in data exchange. We developed a framework for performing key tasks of data exchange based on the semantics of incompleteness, and applied it in open-world, closed-world, and mixed-semantics scenarios. Regarding XML data exchange, we developed, essentially from scratch, a complete toolkit. It covers the specification of mappings, their static analysis, building target solutions, and query answering. We provided a complete classification of classes of schema mappings based on the complexity of their static analysis; we classified schema mappings based on the behavior of query answering algorithms, and identified a large and practically relevant class of XML schema mappings that admits particularly efficient static analysis and query answering algorithms. We answered long-standing open questions on the complexity of building solutions in data exchange by providing an algorithm with tractable data complexity for materializing solutions. In addition, we developed the basics for doing data exchange on instances with incomplete information, bringing the theory much closer to practice (so far, this work has been done for relations, as a necessary first step towards extending it to XML). |
Exploitation Route | These results have been used in open-source software that provides both relational and XML data exchange. |
Sectors | Education |
Description | Data exchange: XML. Our work on XML data exchange created the standard that others now use in their work, and provided algorithms that are being implemented in research prototypes. Data exchange: semantics. Prior to our work, everyone used a single model of data exchange, while admitting its obvious shortcomings. We explained that these shortcomings come from the mishandling of incomplete information that naturally occurs in databases arising in data exchange. We developed a framework for performing key tasks of data exchange based on the semantics of incompleteness, and applied it in open-world, closed-world, and mixed-semantics scenarios. Our approach has since been used by many researchers to provide proper data exchange tools in scenarios where this was previously not possible (for instance, handling target constraints under the closed-world assumption, or dealing with aggregate queries). |
Sector | Digital/Communication/Information Technologies (including Software) |