Heterogeneous and Permanent Data
Lead Research Organisation: University of Edinburgh
Department Name: Lab. for Foundations of Computer Science
Abstract
Since its inception a little more than four years ago, the Database Group in the School of Informatics has grown into a leading database research group -- certainly the strongest in the UK and one of the strongest in the world. The group has also gained visibility by leading the research of the UK Digital Curation Centre. This application for a platform grant is to sustain that momentum and to provide the means to continue the group's interaction with the Digital Curation Centre.
The work of the group is based on the proposition that our data resources are valuable, that they are necessarily heterogeneous in structure, and that, in the case of research data, we need to preserve that value for future researchers and scholars. The main research themes of the group are data exchange and integration, provenance and data quality, security, distributed data, and data archiving. We have been particularly concerned with advancing these topics in relation to semistructured data such as XML and web data. It goes almost without saying that we cannot make much progress without understanding principles and building models, hence our involvement in database theory. Equally, database work is all about making things work efficiently, hence our extensive work in database systems and the many forms of optimisation related to the storage and manipulation of data.
The work of the PIs over the first four years has been devoted to building up a critical mass of researchers and developing a good set of research topics. Having put our initial effort into this -- as well as into the concomitant effort of finding space, administrative support, hiring, teaching new courses, etc. -- it is now time to turn our energy to building new collaborative links with the UK and Europe.
To this end we are initiating a UK-based collaborative project on data quality, European collaborations on database preservation and dynamic web data, new ties with the financial sector in Edinburgh and some e-science collaborations with the Digital Curation Centre. A platform grant will provide the flexibility to move our researchers onto these new projects and will allow us to respond rapidly to new research problems that we expect to arise in connection with all these areas.
Publications
Acar U.
(2010)
A graph model of data and workflow provenance
in 2nd Workshop on the Theory and Practice of Provenance, TaPP 2010
Amano S
(2009)
XML schema mappings
Barceló P
(2014)
Efficient Approximations of Conjunctive Queries
in SIAM Journal on Computing
Barceló P
(2012)
Efficient approximations of conjunctive queries
Barceló P
(2009)
On Low Treewidth Approximations of Conjunctive Queries
Benedikt M
(2009)
Schema-based independence analysis for XML updates
in Proceedings of the VLDB Endowment
Buneman P
(2008)
Curated databases
Buneman P
(2009)
Curating the CIA World Factbook
in International Journal of Digital Curation
Buneman P
(2012)
Hierarchical models of provenance
Description | From a research perspective, the grant was instrumental in the transition and application of database research to graph databases. Rather than being kept in highly structured (and sometimes constraining) relational databases, much data is now stored in less structured or differently structured formats such as XML, JSON and RDF. This grant was key in the transition of database research to these formats. In addition, the grant promoted new ideas in data cleaning and started the (computational) field of data citation. |
Exploitation Route | The following topics have been taken forward by others: data cleaning, data exchange, data citation |
Sectors | Digital/Communication/Information Technologies (including Software); Culture, Heritage, Museums and Collections; Other |
Description | Data Citation. Work on this started during the project, and the topic is directly related to the themes of the project. It is now recognised as a major issue, even by the EPSRC itself in its blurbs about data. Not sure if this is "non-academic", but it is pervasive in all forms of scholarship. The computational principles are now being implemented in various areas. Rural Networks. This is only distantly related to this project, but since Researchfish seems to think that it is related (see related grants), it is worth pointing out that we have now built the biggest rural network in the UK, and literally thousands of people are benefiting from high-speed internet in the most remote parts of Scotland. |
First Year Of Impact | 2010 |
Sector | Digital/Communication/Information Technologies (including Software); Other |
Impact Types | Cultural, Societal, Economic, Policy & public services |
Description | Carnegie UK Trust |
Amount | £40,000 (GBP) |
Organisation | Carnegie Trust |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 05/2012 |
End | 06/2013 |