Query-driven Data Acquisition from Web-based Data Sources
Lead Research Organisation:
University of Oxford
Department Name: Computer Science
Abstract
The functioning of entities as diverse as enterprises and government agencies depends on obtaining high-quality data. Increasingly these entities depend on external sources for their operational data: critical data is obtained dynamically via web services, is extracted from web pages, or is purchased from third parties. These sources can differ radically in their completeness, accuracy, and availability. It is not possible for applications to index and explore data from each source in advance of querying: there are too many sources, they are too costly to access, and the data in them may be refreshed constantly. How should data acquisition proceed in such situations? In this project we will develop algorithms for answering queries in the presence of large numbers of web-based data sources, sources that may overlap substantially in their datasets but have different access restrictions and costs. Our approach will make use of schema information about the data an application is querying: data format, integrity constraints, and any prior knowledge of costs that may be available. The core of the project will be algorithms for answering a query by interactively exploring the sources, dynamically pruning out irrelevant or exhausted sources in the process.
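As a rough illustration of the exploration-with-pruning idea in the abstract, the sketch below answers a query by repeatedly accessing the cheapest remaining source. It is not the project's algorithm: the `Source` class, the `acquire` function, the `cost_per_access` field, and the cheapest-first policy are all hypothetical choices made for this example, and the "web sources" are simulated as in-memory pages.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical web source: a per-access cost and paged, simulated data."""
    name: str
    cost_per_access: float
    relations: frozenset  # relations the source's schema says it can serve
    pages: list           # one list of records is returned per access
    _next: int = 0        # index of the next page to return

    def exhausted(self) -> bool:
        return self._next >= len(self.pages)

    def access(self) -> list:
        page = self.pages[self._next]
        self._next += 1
        return page


def acquire(sources, relation, matches, wanted):
    """Gather matching records until `wanted` distinct ones are found
    or no useful source remains. Returns (answers, total cost spent)."""
    # Prune sources that the schema marks as irrelevant to the query,
    # along with any that are already empty.
    live = [s for s in sources if relation in s.relations and not s.exhausted()]
    answers, spent = set(), 0.0
    while live and len(answers) < wanted:
        # Assumed policy: always access the cheapest live source next.
        src = min(live, key=lambda s: s.cost_per_access)
        spent += src.cost_per_access
        answers.update(r for r in src.access() if matches(r))
        # Prune sources that have no more data to offer.
        live = [s for s in live if not s.exhausted()]
    return answers, spent


if __name__ == "__main__":
    cheap = Source("cheap", 1.0, frozenset({"flights"}),
                   [[("LHR", "SCL")], [("LHR", "VIE")]])
    pricey = Source("pricey", 5.0, frozenset({"flights"}),
                    [[("LHR", "SCL"), ("LHR", "TXL")]])
    hotels = Source("hotels", 0.1, frozenset({"hotels"}), [[("x",)]])
    found, cost = acquire([cheap, pricey, hotels], "flights",
                          lambda r: r[0] == "LHR", wanted=3)
    print(sorted(found), cost)
```

In this toy run the `hotels` source is never accessed because its schema rules it out, and the overlapping record held by both flight sources is counted only once. A realistic version would also need to account for access restrictions and integrity constraints, which this sketch ignores.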
Organisations
People | ORCID iD |
Michael Benedikt (Principal Investigator) | |
Publications
Amarilli A (2020) Finite Open-world Query Answering with Number Restrictions, in ACM Transactions on Computational Logic
Amarilli A (2016) Query Answering with Transitive and Linear-Ordered Data
Amarilli A (2017) When Can We Answer Queries Using Result-Bounded Data Interfaces? in CoRR abs
Barany V (2020) Some Model Theory of Guarded Negation
Benedikt M (2012) ProFoUnd
Benedikt M (2017) Characterizing Definability in Decidable Fixpoint Logics
Benedikt M (2013) Two Variable vs. Linear Temporal Logic in Model Checking and Games, in Logical Methods in Computer Science
Benedikt M (2016) Generating Plans from Proofs, in ACM Transactions on Database Systems
Benedikt M (2011) CONCUR 2011 - Concurrency Theory
Benedikt M (2012) Automata, Languages, and Programming
Description | We discovered that query optimization can be approached via proof-theoretic methods, and that different proof systems can lead to new query optimization algorithms. |
Exploitation Route | We have created a query optimization system based on these proof-theoretic methods, which we are developing with a customer in a follow-up grant. |
Sectors | Digital/Communication/Information Technologies (including Software); Retail |
Description | Invited talk in Chile |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Undergraduate students |
Results and Impact | Invited talk at the main seminar of the Pontifical Catholic University of Chile's mathematics department. |
Year(s) Of Engagement Activity | 2014 |
URL | https://www.ing.uc.cl/ingenieria-matematica/7-seminario-ingenieria-matematica-2/ |
Description | Invited tutorial at workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | I was invited to give a tutorial on query reformulation at the main summer school in Data Management, associated with the Alberto Mendelzon Workshop on Management of Data. |
Year(s) Of Engagement Activity | 2014 |
Description | Keynote at database workshop |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Keynote talk on query optimisation over web data sources at a workshop on data management. |
Year(s) Of Engagement Activity | 2014 |
URL | https://users.dcc.uchile.cl/~jperez/amw2014/ |
Description | Keynote at main workshop on Description Logics |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Invited keynote on new approaches to query reformulation in databases at the main meeting for research in Description Logics (DL 2014). |
Year(s) Of Engagement Activity | 2014 |
URL | https://www.dbai.tuwien.ac.at/dl2014/ |
Description | Organization of Workshop on Ontologies and Data Management |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Lead and co-organizer of a workshop at the Dagstuhl center for computer science, Europe's leading venue for computer science seminars and workshops. The workshop dealt with the interface of data management, logic, and semantic web research, including researchers from each of these areas. |
Year(s) Of Engagement Activity | 2014 |
URL | http://drops.dagstuhl.de/opus/volltexte/2014/4794/ |
Description | Summer school course on Logic and Data Management |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Presented a one-week short course on logical issues in data management at one of the main European summer schools, the European Summer School on Logic, Language and Information. |
Year(s) Of Engagement Activity | 2014 |
URL | http://www.evolaemp.uni-tuebingen.de/esslli2014/program/week-two/ |