VADA: Value Added Data Systems -- Principles and Architecture
Lead Research Organisation:
University of Oxford
Department Name: Computer Science
Abstract
Data is everywhere, generated by increasing numbers of applications, devices and users, with few or no guarantees on the format, semantics, and quality. The economic potential of data-driven innovation is enormous, estimated by the Centre for Economics and Business Research to reach as much as £40B in 2017. To realise this potential, and to provide meaningful data analyses, data scientists must first spend a significant portion of their time (estimated as 50% to 80%) on "data wrangling" - the process of collecting, reorganising, and cleaning data.
This heavy toll is due to what is referred to as the four V's of big data: Volume - the scale of the data, Velocity - speed of change, Variety - different forms of data, and Veracity - uncertainty of data. There is an urgent need to provide data scientists with a new generation of tools that will unlock the potential of data assets and significantly reduce the data wrangling component. As many traditional tools are no longer applicable in the 4 V's environment, a radical paradigm shift is required. The proposal aims at achieving this paradigm shift by adding value to data, by handling data management tasks in an environment that is fully aware of data and user contexts, and by closely integrating key data management tasks in a way not yet attempted, but desperately needed by many innovative companies in today's data-driven economy.
The VADA research programme will define principles and solutions for Value Added Data Systems, which support users in discovering, extracting, integrating, accessing and interpreting the data of relevance to their questions. In so doing, it uses the context of the user, e.g., requirements in terms of the trade-off between completeness and correctness, and the data context, e.g., its availability, cost, provenance and quality. The user context characterises not only what data is relevant, but also the properties it must exhibit to be fit for purpose. Adding value to data then involves the best effort provision of data to users, along with comprehensive information on the quality and origin of the data provided. Users can provide feedback on the results obtained, enabling changes to all data management tasks, and thus a continuous improvement in the user experience.
Establishing the principles behind Value Added Data Systems requires a revolutionary approach to data management, informed by interlinked research in data extraction, data integration, data quality, provenance, query answering, and reasoning. This will enable each of these areas to benefit from synergies with the others. Research has developed focused results within such sub-disciplines; VADA develops these specialisms in ways that both transform the techniques within the sub-disciplines and enable the development of architectures that bring them together to add value to data.
The commercial importance of the research area has been widely recognised. The VADA programme brings together university researchers with commercial partners who are in desperate need of a new generation of data management tools. They will be contributing to the programme by funding research staff and students, providing substantial amounts of staff time for research collaborations, supporting internships, hosting visitors, contributing challenging real-life case studies, sharing experiences, and participating in technical meetings. These partners are both developers of data management technologies (LogicBlox, Microsoft, Neo) and data user organisations in healthcare (The Christie), e-commerce (LambdaTek, PricePanda), finance (AllianceBernstein), social networks (Facebook), security (Horus), smart cities (FutureEverything), and telecommunications (Huawei).
Planned Impact
The economic impact of relevant activities is difficult to approximate, but the value of the sub-areas of Big Data, Data Integration and Data Quality is forecast to be over $50B by 2017:
- The International Institute of Analytics estimate the Big Data market at $16.1B in 2014, growing 6 times faster than the overall IT market. Projection for 2017 is ~$50B.
- Gartner (2014) estimates the Data Integration tool market at over $2.2B at end 2013, an increase of 9.4% from 2012. Growth rate is above average for the enterprise software market. By 2018 total revenue should be ~$3.6B.
- Gartner (2014) estimates the Data Quality market as $960M in software revenue at end 2012 ($2B by 2017), an increase of 12.3% from 2011.
Thus directly associated markets - with users across government, industry, health and commerce - are large and fast growing.
Who will benefit from this research?
Data is central to the efficient operation of many technology development and user organisations, and is the raison d'etre for many others. Here we categorise potential VADA beneficiaries, into:
1. Technology providers of platforms and solutions for collecting, integrating, and aggregating data. Partner examples include LogicBlox, Microsoft, Neo. New business opportunities are likely to emerge, where impact results from the development of techniques to enable more efficient and effective use of available data.
2. Organisations having a need for such platforms. This is almost every organization; our partners include knowledge companies who work with product (LambdaTek, PricePanda), financial (AllianceBernstein), security (Horus), social networking (Facebook), telecommunications (Huawei), governmental (FutureEverything) and healthcare (Christie) data.
All partners have highlighted the importance of this research in their support letters:
* VADA addresses fundamental questions that have great significance (Microsoft),
* The challenge addressed by VADA is a significant one (LogicBlox),
* VADA tackles several problems that are of great interest (Facebook),
* We need an automatic approach to reliable, timely and continuous collection and evaluation of sources against an ever-increasing amount of raw data. Current data collection technologies are neither reliable nor scalable enough. (Horus)
* To remain competitive we need to enrich our product data with extended background data. No technology that currently exists can do this. (LambdaTek).
How might they benefit from this research?
VADA's impact is in line with the RCUK priorities:
1. Contribute toward wealth creation and economic prosperity. VADA will develop techniques and methodologies informing the development of platforms to add value to data. Among the many mechanisms that can realise this, we propose a consultancy spin-out. We believe that this will ease the efficient transfer of knowledge from academia to UK industry, as previously demonstrated by similar successful ventures.
2. Shape/enhance effectiveness of public services. The UK has signed up to the Open Government Declaration, which should make travel easier and healthcare better, and create significant growth for UK industry (http://www.cabinetoffice.gov.uk/news/open-data-measures-autumn-statement). However, exploiting such data involves inter-relating it with other data sources, managing variety and veracity. SMEs such as FutureEverything will benefit from efficient techniques for adding value to such data.
3. Enhance training capacity, knowledge and skills of businesses and organisations. Within many organisations, efficient sharing and use of data is crucial for decision-making. VADA will directly train 11 PhD students, supporting exchange visits, workshops, and a summer school. VADA's academics will also be involved in the design of training courses on Value Added Data Systems for the next generation of higher education post-graduate programmes and skill training courses for industry.
Organisations
- University of Oxford (Lead Research Organisation)
- Neo4j (Collaboration)
- Bank of Italy (Collaboration)
- Peak AI (Collaboration)
- Linked Data Benchmark Council (LDBC) (Collaboration)
- AllianceBernstein plc. (Project Partner)
- PricePanda Group (Project Partner)
- Facebook (United States) (Project Partner)
- The Christie Hospital (Project Partner)
- Microsoft (United States) (Project Partner)
- FutureEverything (Project Partner)
- Logicblox (Project Partner)
- Huawei Technologies (China) (Project Partner)
- Horus Security Consultancy Ltd (Project Partner)
- LambdaTek (Project Partner)
- Neo Technology UK (Neo4J) (Project Partner)
Publications
Abboud R
(2020)
Learning to Reason: Leveraging Neural Networks for Approximate DNF Counting
in Proceedings of the AAAI Conference on Artificial Intelligence
Abboud R
(2022)
Approximate weighted model integration on DNF structures
in Artificial Intelligence
Abboud R.
(2020)
Learning to reason: Leveraging neural networks for approximate dnf counting
in AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
Abel E
(2020)
Pairwise comparisons or constrained optimization? A usability evaluation of techniques for eliciting decision priorities
in International Transactions in Operational Research
Abel E
(2018)
SOURCERY
Abel E
(2018)
User driven multi-criteria source selection
in Information Sciences
Abel E
(2020)
Targeted evidence collection for uncertain supplier selection
in Expert Systems with Applications
Amato G
(2022)
Dynamic Responsive Inguinal Scaffold Activates Myogenic Growth Factors Finalizing the Regeneration of the Herniated Groin.
in Journal of functional biomaterials
Description | We have provided a complete spectrum of the steps required for data science activities in the context of Vadalog, including 1) Data Integration and Pre-processing, 2) Statistical Analysis, 3) Machine Learning, 4) Algorithmic Modelling, 5) Probabilistic Reasoning. In each of these parts there have been significant improvements, including the involvement of further Machine Learning approaches. We have extended the research-related activities of Vadalog to involve further machine learning techniques. This involves extensive studies on Neural Networks and Knowledge Graph Embedding Models. Recent work in these areas has shown the importance of logical rules. As the core of Vadalog is rule-based reasoning, we launched related work on logical rule injection in such ML-related approaches. As one of the characteristics of Knowledge Graphs is their uncertainty in terms of noisy, missing, and incorrect data, we investigated the effect of noise in the presence of logical rules. We show that by introducing a new loss function that is both pattern-aware and noise-resilient, significant performance issues can be solved. |
Exploitation Route | 1. The newly developed and researched machine-learning-based models can significantly improve the results of rule mining and reasoning processes, besides improving the execution efficiency of the Vadalog system. 2. Embedding models are specifically designed for link prediction tasks, and this characteristic can be used to make the more complex steps of logic-based reasoning more efficient. |
Sectors | Aerospace, Defence and Marine, Digital/Communication/Information Technologies (including Software), Education, Retail, Transport |
URL | http://vada.org.uk/ |
Description | 1. The findings have been used in DBLP to improve their web data extraction approach. In particular, the enhanced version of OXPath is able to extract data from complex web applications, which was not possible earlier. 2. Other non-academic/company collaborations have been conducted. Results are subject to further progress on these collaborations. 3. After becoming familiar with VADA-supported work on mappings of property graphs and on formal semantics of query languages, Neo Technology funded a follow-up project on providing formal underpinnings and semantics of the Cypher query language of their Neo4j graph database system. 4. A series of meetings has been held with BAE Systems. Significant human effort is currently needed to manually process and integrate data, which can delay effective decision making. As a result, there is interest in the combination of decision support and data integration that is being explored in VADA, and we hope this will lead to a research collaboration in due course. 5. The impact of machine learning approaches and embedding models has been examined with a use case in the financial domain and is planned to be extended to industry scale. 6. VADA has had significant impact on research partners, in particular recently on the leading appliance company Miele and the leading bank Sberbank. 7. The Central Bank of Italy has a continuous interest in VADA and the Vadalog system. There is ample evidence, in the form of publications, of the ongoing and fruitful scientific collaboration. |
First Year Of Impact | 2019 |
Sector | Aerospace, Defence and Marine,Digital/Communication/Information Technologies (including Software),Education,Financial Services, and Management Consultancy |
Impact Types | Societal, Economic, Policy & public services |
Description | Amazon Web Services Research Credits |
Amount | $10,760 (USD) |
Organisation | Amazon.com |
Sector | Private |
Country | United States |
Start | 08/2018 |
End | 09/2019 |
Description | EPSRC Industrial CASE Doctoral Studentship: AI and Cognitive Computing for Reasoning about Big Data with Application to the Oil and Gas Industry |
Amount | £200,000 (GBP) |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 09/2017 |
End | 04/2021 |
Description | Efficient Querying of Inconsistent Data |
Amount | £606,439 (GBP) |
Funding ID | EP/S003800/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 08/2018 |
End | 08/2024 |
Description | Innovate UK Internet of Things Cities Demonstrator |
Amount | £856,996 (GBP) |
Organisation | Innovate UK |
Sector | Public |
Country | United Kingdom |
Start | 06/2016 |
End | 06/2018 |
Description | LAMBDA: Learning, Applying, Multiplying Big Data Analytics |
Amount | £168,835 (GBP) |
Funding ID | GA No. 809965 |
Organisation | European Commission |
Sector | Public |
Country | European Union (EU) |
Start | 06/2018 |
End | 12/2020 |
Description | Neo Technology - industry funding |
Amount | £180,000 (GBP) |
Organisation | Neo4j |
Sector | Private |
Country | United States |
Start | 01/2017 |
End | 09/2021 |
Description | New generation of graph query languages |
Amount | £59,999 (GBP) |
Organisation | The Leverhulme Trust |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 08/2022 |
End | 03/2024 |
Description | RS Wolfson |
Amount | £50,000 (GBP) |
Organisation | The Royal Society |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 01/2017 |
End | 12/2021 |
Description | Raison Data - Royal Society Research Professorship |
Amount | £1,304,142 (GBP) |
Funding ID | RP\R1\201074 |
Organisation | The Royal Society |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 03/2020 |
End | 02/2025 |
Description | Ratiolytics: a rule-based AI system for reasoning, data wrangling and analytics |
Amount | £44,123 (GBP) |
Funding ID | EP/R511742/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 06/2017 |
End | 04/2018 |
Description | Bank of Italy |
Organisation | Bank of Italy |
Country | Italy |
Sector | Public |
PI Contribution | Bank of Italy (Banca d'Italia - Italy's National Bank) uses VADALOG, the reasoning language generated by the VADA project. Oxford hosted a Bank of Italy Senior Engineer (Luigi Bellomarini) and introduced him to the VADA technology and to the underlying research. |
Collaborator Contribution | The Bank of Italy adopted the VADALOG language and system and obtained a license from the VADA startup DeepReason.ai. Based on this software, BoI introduced us to many new use cases and contributed to various research papers. The cooperation has been ongoing from the academic year 2016/17 until beyond the end of the project. |
Impact | The outputs are all the publications in the list of publications co-authored by Dr Luigi Bellomarini. Please just search for "Bellomarini" in the publication list. The collaboration is in the field of Computer Science with application to Central Bank problems and Economics. For example, new methods for detecting the degree of ownership between two companies were developed. |
Start Year | 2016 |
Description | LDBC |
Organisation | Linked Data Benchmark Council (LDBC) |
Country | United Kingdom |
Sector | Charity/Non Profit |
PI Contribution | LDBC is a key international organisation supported by multiple industrial partners (Oracle, Neo4j, AWS, TigerGraph etc) in charge of bringing together industry and academia in developing new standards and benchmarks for graph databases. I chair one of its working groups, on formal semantics of query languages, and actively participate in two others (on treatment of null values and on property graph schemas). |
Collaborator Contribution | They provide us with tools to facilitate collaboration. |
Impact | The most visible one so far is the forthcoming SIGMOD 2021 paper on keys for property graphs. Others go via ISO. |
Start Year | 2020 |
Description | Neo technology (Neo4j) |
Organisation | Neo4j |
Country | United States |
Sector | Private |
PI Contribution | We established a joint research project with a leading vendor of graph databases, Neo Technology (based in the UK and Sweden; the name of their product is Neo4j). Our initial goal was to produce a formal semantics of their query language Cypher and to make suggestions about further development of the language. It was then expanded to the design of the new graph query language GQL. |
Collaborator Contribution | In addition to committing a significant amount of core staff time, Neo4j has provided funding continuously since 2017. |
Impact | A full formal semantics of the core language has been developed. Papers describing it appeared in SIGMOD 2018 and VLDB 2019. Since then the focus has shifted to GQL (papers are to be written). |
Start Year | 2017 |
Description | Peak.ai |
Organisation | Peak AI |
Country | United Kingdom |
Sector | Private |
PI Contribution | We are working with Peak.ai through a Knowledge Transfer Partnership, to develop and apply techniques for entity resolution and data discovery. |
Collaborator Contribution | Building on our work in VADA on data discovery, we have been working with Peak on techniques to make the onboarding of customer data more systematic and less labour-intensive. |
Impact | N/A |
Start Year | 2019 |
Title | JupyterLab environment for Vadalog execution |
Description | We have adapted the JupyterLab data science environment to execute Vadalog, the reasoning language developed in the context of the VADA project. Features include: (1) Rule authoring and execution, (2) Interaction with Python and R, (3) Program analysis and debugging, (4) Model Explanations (Proof Trees and Audit Trails), (5) Visualisation |
Type Of Technology | Webtool/Application |
Year Produced | 2018 |
Impact | This is a main contribution towards adoption of Vadalog in the wider community, as Jupyter has recently become a popular tool in the data science and research community. We can expect researchers to use Vadalog for specific purposes alongside other tools in their toolchain, without having to adapt to a new environment. |
Company Name | DeepReason.ai |
Description | DeepReason.ai develops a 'Knowledge Graph Platform' for organisations, which uses AI to unify data from multiple areas of the business in order to conduct analysis. |
Year Established | 2018 |
Impact | The company was acquired by Meltwater Inc. in November 2021: https://www.meltwater.com/en/about/press-releases/meltwater-acquires-deepreason-ai |
Website | https://deepreason.ai/ |
Company Name | The Data Value Factory |
Description | The Data Value Factory develops software for cleaning and integrating data. |
Year Established | 2018 |
Impact | The company provides software and services for data preparation. |
Website | http://thedatavaluefactory.com |
Description | Chair, Formal Semantics Working Group of the Linked Data Benchmark Council |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | The Linked Data Benchmark Council is an organisation that arranges work by academics on behalf of graph database vendors as well as groups that produce new standards for graph query languages. Since 2020, Leonid Libkin has led the formal semantics working group, which comprises academics from the UK, France, Germany, Poland, and Chile, and which analyses the emerging standard of graph querying called GQL. The group works in close collaboration with companies such as Neo4j (UK/Sweden), Oracle and TigerGraph (US). Its contributions are already reflected in the new part of the SQL standard for querying graphs, SQL/PGQ. |
Year(s) Of Engagement Activity | 2020,2021,2022 |
Description | Data Wrangling for Big Data |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | This workshop included presentations, demonstrations and posters on work that relates to big data wrangling, with a view to sharing best practice and emerging techniques. The presentations included the ongoing work from the VADA partners as well as presentations from industrial experts. The event was intended primarily for data scientists and computer scientists from business/industry and academia. It sparked further discussions about the field of data wrangling, extraction, cleaning and reasoning, i.e., the subjects of the VADA project. |
Year(s) Of Engagement Activity | 2017 |
URL | https://www.turing.ac.uk/events/data-wrangling-big-data/ |
Description | Invited Speaker at International Conference on Model and Data Engineering |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The Eighth International Conference on Model & Data Engineering (MEDI) will be held from 24 to 26 October 2018 in Marrakesh, Morocco. Its main objective is to provide a forum for the dissemination of research accomplishments and to promote the interaction and collaboration between the models and data research communities. MEDI'2018 provides an international platform for the presentation of research on models and data theory, development of advanced technologies related to models and data and their advanced applications. This international scientific event, initiated by researchers from Euro-Mediterranean countries, aims also at promoting the creation of north-south scientific networks, projects and faculty/student exchanges. |
Year(s) Of Engagement Activity | 2018 |
URL | https://easychair.org/cfp/MEDI2018 |
Description | Invited Speaker at International European Conference on Logics in Artificial Intelligence |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The conference is about logic in AI and I gave a talk about Vadalog which is a logic-based reasoning language for modern AI applications, in particular for knowledge graph systems. I presented recent advances and applications, with a focus on the language Vadalog itself. |
Year(s) Of Engagement Activity | 2019 |
URL | https://jelia2019.mat.unical.it/invited-speakers#h.p_446ZRly1aeB1 |
Description | Invited Talk - Lovelace Lecture And Conferment Of The Lovelace Medal |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Industry/Business |
Results and Impact | 22 March 2018 Lovelace Lecture And Conferment Of The Lovelace Medal to Prof. Gottlob |
Year(s) Of Engagement Activity | 2018 |
URL | https://www.bcs.org/category/19248 |
Description | Invited plenary speaker at International Conference On Scalable Uncertainty Management |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Keynote talk: Swift Logic for Big Data and Knowledge Graphs |
Year(s) Of Engagement Activity | 2018 |
URL | http://www.ir.disco.unimib.it/sum2018/invited-speakers/ |
Description | Keynote at Austrian Computer Science Day |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | I talked about my adventures with Datalog, walking the thin line between theory and practice. |
Year(s) Of Engagement Activity | 2017,2019 |
URL | https://acsd2019.ai.wu.ac.at/timetable/event/georg-gottlob/ |
Description | Lecture at the Samsung Cambridge Research Centre |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Industry/Business |
Results and Impact | Lecture entitled "A Journey from Web Data Extraction over Knowledge Graphs towards Integrating Rules with Machine Learning". Abstract: This talk first reports on DIADEM, a past ERC-funded project at Oxford University for fully-automated domain-specific web data extraction (http://diadem.cs.ox.ac.uk/). DIADEM loosely integrates machine learning (ML) tasks with transferable rule-based knowledge. This project was very successful and gave rise to a spin-out company. The ML-knowledge integration in DIADEM was ad hoc and problem-specific, and therefore, in a follow-up project, we designed the VADALOG knowledge graph management system, which allows engineers to realize applications in various areas that make use of a loose integration of rule-based knowledge and ML. Another goal of VADALOG is efficient and expressive reasoning over Big Data. VADALOG was designed in the context of the EPSRC Program Grant VADA (Value-Added Data Systems; https://vada.org.uk/). We describe the language and principles underlying VADALOG, and discuss some applications of VADALOG developed by the DeepReason.ai VADA spin-out (founded in 2018). The journey does not finish here. In the context of the new RAISON DATA project, we aim at a much tighter integration of ML with rule-based knowledge. We will give some motivations and present our initial approach en route to a new type of system. |
Year(s) Of Engagement Activity | 2020 |
Description | Member of the SQL Standard ISO Committee (officially: ISO/IEC JTC1 SC32 WG3) |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | SQL is the main language of relational database systems, used by practically all business and governmental organizations. It is standardized by ISO (International Organization for Standardization). Since 2018, Leonid Libkin has been a member of that committee, currently one of only 4 academics influencing the design of this ubiquitous query language in such areas as handling graph queries and incomplete data. |
Year(s) Of Engagement Activity | 2018,2019,2020,2021,2022 |
Description | Milner Lecture, at Edinburgh University |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Postgraduate students |
Results and Impact | Milner Lecture 2018: Swift Logic for Big Data and Knowledge Graphs |
Year(s) Of Engagement Activity | 2018 |
URL | http://wcms.inf.ed.ac.uk/lfcs/events/swift-logic-for-big-data-and-knowledge-graphs |
Description | The EDBT Summer School |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | The EDBT Summer School brings together leading researchers recognised as experts in their fields and gives participants the opportunity to gain deeper insight into current research trends in the database area. In 2017, the theme of the school will be "Adding Value to Data". The scientific topics will cover principles and solutions for adding value to data, that is, for supporting users in discovering, extracting, integrating, accessing and interpreting the data of relevance to their questions, while taking into account the role of the crowd and the impact of dirty data as well as the adoption of responsible data management and analysis processes. The school will be organised around 7 main themes. The 2017 summer school follows the successful structure of previous EDBT schools: stimulating lectures by leading researchers in the field (two for each main theme), group work on assignments, and a lively scientific and social program. |
Year(s) Of Engagement Activity | 2017 |
Description | Invited plenary speaker at RuleML+RR |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Invited plenary speaker at RuleML+RR |
Year(s) Of Engagement Activity | 2018 |