Quaid - a platform for improving data quality

Lead Research Organisation: University of Edinburgh
Department Name: School of Informatics

Abstract

Information holds the key to success. The accuracy of data determines operational performance, regulatory compliance and the effectiveness of business strategy. Organisations in every industry worldwide are increasingly aware of the costs and risks caused by data that is inconsistent, inaccurate, stale or deliberately falsified. The Data Warehousing Institute estimates that poor-quality data costs US businesses $600 billion annually. Many companies are investing in data quality solutions that help increase transparency and productivity and, as a result, the data quality market is experiencing rapid growth. Whilst these companies are making progress on internal clean-up and consolidation tasks, such activities require large amounts of manual effort. A breakthrough that provides a theoretical foundation and practical algorithms for data quality management has been pioneered at the School of Informatics. This approach, based on a novel extension of classical dependency theory, increases the level of automation and improves accuracy in the data quality process. In 2008, Prof. Wenfei Fan received the British Computer Society Roger Needham Award and the Chinese Yangtze River Scholar Award for his research in this area. Building on the output of this award-winning research, we aim to deliver a concept system, Quaid, which scales to real commercial datasets and addresses the needs of industrial customers. Through Quaid, we envisage that new products and services will be generated from existing digital data sources.

Publications

Fan W (2010) Dynamic constraints for record matching in The VLDB Journal

Fan W (2010) Relative information completeness in ACM Transactions on Database Systems

Fan W (2011) Discovering Conditional Functional Dependencies in IEEE Transactions on Knowledge and Data Engineering

 
Description A functional system for repairing critical data, with performance guarantees on accuracy.
Exploitation Route Practitioners can develop their own data quality systems based on our models. Our publications also identify directions for future work that researchers can pursue.
Sectors Digital/Communication/Information Technologies (including Software), Financial Services, and Management Consultancy, Healthcare

URL http://homepages.inf.ed.ac.uk/wenfei/publication.html
 
Description CerFix, a system for repairing critical data with accuracy guarantees, based on our conditional dependency theory and the model of certain fixes. The work on certain fixes received the best paper award at VLDB 2010, the top-ranked all-round international database conference. The system was demonstrated at VLDB 2011.
First Year Of Impact 2010
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic