Enforcement of Constraints on XML Streams
Lead Research Organisation: University of Oxford
Department Name: Computer Science
Abstract
The eXtensible Markup Language (XML) has become a ubiquitous format for exchanging data. Enterprise data from industries as diverse as finance, healthcare, and genomics is routinely exchanged as XML. Much of this XML-encoded information has to be queried on the fly as it arrives, that is, as an XML stream. News and event information, for example, is available in the form of XML feeds; applications that react to these events must process the feeds in streaming fashion. Communication and messaging protocols also make use of XML, and the corresponding protocol handlers are thus also XML stream processors.
A crucial aspect of processing any form of data is validation: before data is made available to applications, it must be in a "sane" state. When data is exchanged over networks, corruption is ubiquitous, because messages are received from untrusted or even unknown parties. Indeed, much or even most of the data being sent to web-accessible application servers may come from malicious or compromised hosts.
The XML community has already developed standardized means for describing constraints on the structure of XML documents. On the one hand, there are schema-based constraints, such as Document Type Definitions (DTDs), which restrict the tags that can occur within a document. On the other hand, qualifiers in the XML query language XPath provide a more flexible method for adding application-specific constraints. But how can a firewall enforce these constraints efficiently on large collections of parallel feeds? This is a critical issue, whether the XML streams represent signalling messages, event feeds, or web service calls. This project will study which constraints can and cannot be enforced efficiently, and will provide tools and technologies to effectively monitor XML streams for violations of both schema constraints and application-specific constraints.
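To make the idea of enforcing a schema constraint on a stream concrete, here is a minimal sketch in Python. It is an illustration only, not the project's technique: the toy schema `ALLOWED_CHILDREN`, the `Violation` exception, and the function `check_stream` are hypothetical names introduced for this example. The schema is also much weaker than a real DTD, since it lists which child tags each element may contain and ignores order and cardinality. The check uses the standard-library `xml.parsers.expat` parser, consumes the input chunk by chunk, and keeps only a stack of currently open elements.

```python
import xml.parsers.expat

# Hypothetical toy schema (an assumption, not from the project):
# for each element, the set of child tags it may contain.
ALLOWED_CHILDREN = {
    "feed": {"item"},
    "item": {"title", "body"},
    "title": set(),
    "body": set(),
}


class Violation(Exception):
    """Raised as soon as the stream is known to break the toy schema."""


def check_stream(chunks, allowed=ALLOWED_CHILDREN, root="feed"):
    """Return None if the stream conforms, else a message for the first problem."""
    stack = []  # open elements only: memory grows with depth, not stream length

    def on_start(name, attrs):
        if not stack:
            if name != root:
                raise Violation(f"root must be <{root}>, got <{name}>")
        elif name not in allowed.get(stack[-1], set()):
            raise Violation(f"<{name}> not allowed inside <{stack[-1]}>")
        stack.append(name)

    def on_end(name):
        stack.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    try:
        for chunk in chunks:          # chunks are bytes, e.g. network reads
            parser.Parse(chunk, False)
        parser.Parse(b"", True)       # signal end of input
    except Violation as exc:
        return str(exc)
    except xml.parsers.expat.ExpatError as exc:
        return f"not well-formed: {exc}"
    return None
```

Calling `check_stream` on an iterable of byte chunks rejects the stream at the first offending start tag, without waiting for the rest of the document. Constraints that depend on data seen much earlier or later in the stream, as some XPath qualifiers do, would need more state than this stack; which constraints can be enforced efficiently is the question the abstract raises.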
People
Michael Benedikt (Principal Investigator)
Publications
Benedikt M
(2013)
Report on the first workshop on innovative querying of streams
in ACM SIGMOD Record
Benedikt M
(2010)
Report on the EDBT/ICDT 2010 workshop on updates in XML
in ACM SIGMOD Record
Amarilli A
(2020)
Finite Open-world Query Answering with Number Restrictions
in ACM Transactions on Computational Logic
Benedikt M
(2009)
Regular tree languages definable in FO and in FO mod
in ACM Transactions on Computational Logic
Benaim S
(2016)
Complexity of Two-Variable Logic on Finite Trees
in ACM Transactions on Computational Logic
Benedikt M
(2016)
Limiting Until in Ordered Tree Query Languages
in ACM Transactions on Computational Logic
Bourhis P
(2016)
Bounded Repairability for Regular Tree Languages
in ACM Transactions on Database Systems
Benedikt M
(2009)
From XQuery to relational logics
in ACM Transactions on Database Systems
Benedikt M
(2010)
What you must remember when processing data words
in CEUR Workshop Proceedings
Benedikt M
(2015)
The complexity of higher-order queries
in Information and Computation
Benedikt M
(2013)
Bounded repairability of word languages
in Journal of Computer and System Sciences
Benedikt M
(2017)
Determinacy and rewriting of functional top-down and MSO tree transformations
in Journal of Computer and System Sciences
Benedikt M
(2012)
Querying schemas with access restrictions
in Proceedings of the VLDB Endowment
Benedikt M
(2010)
Probabilistic XML via Markov Chains
in Proceedings of the VLDB Endowment
Benedikt M
(2010)
Destabilizers and independence of XML updates
in Proceedings of the VLDB Endowment
Benedikt M
(2014)
Towards a characterization of order-invariant queries over tame graphs
in The Journal of Symbolic Logic
Benedikt M
(2014)
The per-character cost of repairing word languages
in Theoretical Computer Science
Bousquet-Mélou M
(2014)
XML Compression via Directed Acyclic Graphs
in Theory of Computing Systems
Benedikt M
(2011)
Regular Repair of Specifications
Vu H
(2011)
Complexity of higher-order queries
Ley C
(2009)
How big must complete XML query languages be?
Benedikt M
(2009)
Database Programming Languages
Amarilli A
(2020)
Finite Open-World Query Answering with Number Restrictions
Amarilli A
(2015)
Finite Open-World Query Answering with Number Restrictions
Puppis G
(2012)
Bounded repairability for regular tree languages
Benedikt M
(2011)
CONCUR 2011 - Concurrency Theory
Benaim S
(2013)
Automata, Languages, and Programming
Benedikt M
(2010)
Positive higher-order queries
Benedikt M
(2011)
Automata, Languages and Programming
Amarilli A
(2015)
Combining Existential Rules and Description Logics (Extended Version)
Benedikt M
(2013)
Mathematical Foundations of Computer Science 2013
Description: We developed techniques for transforming and repairing streams of structured data.
Exploitation Route: The stream processors could be used for noticing anomalies in (e.g.) news feeds.
Sectors: Digital/Communication/Information Technologies (including Software)