AI and Cognitive Computing for Reasoning about Big Data and Knowledge Graphs with Application to the Oil and Gas Industry

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

Brief Description: The broad aim of this doctoral project is to gain, in cooperation with BP, a deep understanding of how the latest AI and cognitive computing technologies can be used for reasoning over big data. One application of this project is in the oil and gas industry, supporting and improving core business processes and decision making in this sector. Towards this aim, the planned significant contributions of the student are the development of a rule learner and reasoner for big data, a specific design of the learner for existential rules, and a systematic evaluation of its impact on applications important to BP.
To the mutual benefit of the student, BP, and Oxford University, and to maximise synergies, the studentship will be attached to the VADA "Value Added Data Systems" project which - as one of its goals - aims to develop a general-purpose reasoning system, building on the experience with the Datalog family of languages.
Alignment to EPSRC's Strategies and Research Areas: This project falls within the EPSRC Information and communication technologies (ICT) theme, and the following research areas: Artificial intelligence technologies, Databases, and Information systems.
Novelty and Research Methodology: Declarative rules such as Prolog and Datalog rules are common formalisms for expressing expert knowledge and are used in a number of systems. Since developing such rules is time-consuming and requires scarce expert knowledge, it is essential to develop algorithms for learning them. This project addresses the problem of learning existential rules, which have found applications in many use cases such as Knowledge Graphs, the Semantic Web and Web Data Extraction. In particular, we concentrate on developing evolutionary learning algorithms for existential rules. We define the rule learning setting and review the main approaches to learning rules, such as top-down, bottom-up, and neural methods. We review existing evolutionary approaches to rule learning and discuss different genetic encoding schemes, initial population creation methods, evolution operators, and fitness functions for evaluation.
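To make the notion of an existential rule concrete, the following minimal Python sketch applies a single rule, person(X) -> exists Y. hasParent(X, Y), to a toy fact base. The fact encoding, the predicate names and the function apply_existential_rule are illustrative assumptions and are not taken from the project; the point is only that an existential rule may introduce a fresh placeholder value (a labelled null) when no known value satisfies the rule head.

```python
from itertools import count

# Facts are (predicate, arg1, ...) tuples; this toy encoding is an assumption.
facts = {
    ("person", "alice"),
    ("person", "bob"),
    ("hasParent", "bob", "carol"),
}


def apply_existential_rule(facts):
    """Apply person(X) -> exists Y. hasParent(X, Y) once to every person."""
    fresh = count(1)  # numbers the invented placeholder values
    derived = set(facts)
    for fact in sorted(facts):
        if fact[0] != "person":
            continue
        x = fact[1]
        # Invent a witness only if the head is not already satisfied for x.
        if not any(f[0] == "hasParent" and f[1] == x for f in derived):
            derived.add(("hasParent", x, f"_null{next(fresh)}"))
    return derived


result = apply_existential_rule(facts)
for fact in sorted(result):
    print(fact)
```

In this sketch bob already has a known parent, so only alice receives an invented one. A rule learner for existential rules has to search for rules of this shape, in which the head mentions a variable that does not occur in the body.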
In addition, from a wider view, we explore four interaction models between logical reasoning engines and Machine Learning approaches. Last but not least, we outline answers to the proposed research questions for existential rule learning with promising experimental results, and present applications to crude corrosivity, knowledge graph rule mining and question answering data sets. This project focuses on studying Evolutionary Algorithms (EAs, of which Genetic Algorithms, GAs, are the best-known class) for the Inductive Logic Programming (ILP) problem. EAs are a family of biology-inspired search algorithms that refine the most promising candidate solutions while still exploring a wide search space. In particular, in our setting, atoms and partial or parameterized rules can be treated as chromosomes, from which a new population of chromosomes can be derived via the operations of mutation, crossover and selection. While performing these operations, a quality measure, called the fitness function, is computed to judge whether a newly obtained generation of chromosomes is fit for continuing the search. Such evolutionary algorithms typically do not perform exhaustive search and at the same time are less likely to get stuck in local optima. In addition, they are flexible in that they might not require an imposed template on the shape of rules, as is typically the case in other approaches to ILP.
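The evolutionary loop described above can be illustrated with a minimal Python sketch. It is not the project's algorithm: the toy knowledge graph, the fixed two-atom chain rule encoding, the confidence-based fitness and all function names are assumptions chosen to keep the example short. Each chromosome is a pair of predicates (p, q) standing for the rule p(X, Y), q(Y, Z) -> grandparent(X, Z).

```python
import random

# Toy knowledge graph; all names and facts here are hypothetical.
FACTS = {
    ("parent", "ann", "bob"), ("parent", "bob", "cat"),
    ("parent", "ann", "dan"), ("parent", "dan", "eve"),
    ("sibling", "bob", "dan"), ("friend", "cat", "eve"),
    ("grandparent", "ann", "cat"), ("grandparent", "ann", "eve"),
}
PREDICATES = ["parent", "sibling", "friend"]
TARGET = "grandparent"


def fitness(chromosome):
    """Confidence of the chain rule p(X,Y), q(Y,Z) -> TARGET(X,Z)."""
    p, q = chromosome
    body = {(x, z)
            for (r1, x, y1) in FACTS if r1 == p
            for (r2, y2, z) in FACTS if r2 == q and y1 == y2}
    if not body:
        return 0.0
    hits = sum((TARGET, x, z) in FACTS for (x, z) in body)
    return hits / len(body)


def mutate(chromosome, rng):
    """Replace one randomly chosen body predicate."""
    genes = list(chromosome)
    genes[rng.randrange(len(genes))] = rng.choice(PREDICATES)
    return tuple(genes)


def crossover(a, b):
    """Take the first body atom from one parent, the second from the other."""
    return (a[0], b[1])


def evolve(pop_size=6, generations=10, seed=0):
    rng = random.Random(seed)
    population = [(rng.choice(PREDICATES), rng.choice(PREDICATES))
                  for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Truncation selection: the fittest half survives unchanged (elitism).
        survivors = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(survivors), rng.choice(survivors)), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=fitness), history


best, history = evolve()
print(best, fitness(best), history)
```

Because the fittest half of each generation is kept unchanged, the best fitness never decreases from one generation to the next. Unlike this sketch, which fixes a two-atom template for brevity, the approach described above does not necessarily impose a template on the shape of rules.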
Companies and Collaborators Involved: This is an EPSRC Industrial CASE studentship project in collaboration with BP.

Publications


Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/P510609/1 | | | 01/10/2016 | 30/09/2021 |
2370505 | Studentship | EP/P510609/1 | 01/10/2017 | 29/09/2021 | Lianlong Wu
 
Description This research successfully produced a purely symbolic learning algorithm that meets the original objectives and facilitates explainable knowledge retention and discovery. The outcomes have been published in an academic journal and at a top conference, and are being applied to real-world knowledge graph data as well as industrial production data.
Exploitation Route The academic publications could be developed further by other researchers, and the industrial prototype could inspire further real-world applications.
Sectors Education; Energy; Financial Services, and Management Consultancy; Manufacturing, including Industrial Biotechnology

 
Description The research addresses general knowledge and data, and is therefore applicable across application areas. In particular, it demonstrates the power of explainable symbolic knowledge discovery with high precision for practical use.
First Year Of Impact 2021
Sector Digital/Communication/Information Technologies (including Software); Financial Services, and Management Consultancy; Manufacturing, including Industrial Biotechnology
Impact Types Economic

 
Description Alibaba Cloud (UK) research fund for large scale knowledge graph challenges
Amount $20,000 (USD)
Organisation Alibaba Group 
Sector Private
Country China
Start 12/2022 
 
Description IJCAI 2019 Doctoral Consortium Travel Award
Amount € 300 (EUR)
Organisation International Joint Conferences on Artificial Intelligence 
Sector Charity/Non Profit
Country United States
Start 08/2019 
End 08/2019
 
Title Evoda Algorithm 
Description In addition, from a wider view, we explore four interaction models between logical reasoning engines and Machine Learning approaches. Last but not least, we outline answers to the proposed research questions for existential rule learning with promising experimental results, and present applications to crude corrosivity, knowledge graph rule mining and question answering data sets. This project focuses on studying Evolutionary Algorithms (EAs, of which Genetic Algorithms, GAs, are the best-known class) for the Inductive Logic Programming (ILP) problem. EAs are a family of biology-inspired search algorithms that refine the most promising candidate solutions while still exploring a wide search space. In particular, in our setting, atoms and partial or parameterized rules can be treated as chromosomes, from which a new population of chromosomes can be derived via the operations of mutation, crossover and selection. While performing these operations, a quality measure, called the fitness function, is computed to judge whether a newly obtained generation of chromosomes is fit for continuing the search. Such evolutionary algorithms typically do not perform exhaustive search and at the same time are less likely to get stuck in local optima. In addition, they are flexible in that they might not require an imposed template on the shape of rules, as is typically the case in other approaches to ILP.
Type Of Material Computer model/algorithm 
Year Produced 2022 
Provided To Others? Yes  
Impact It was published at ICDE (IEEE International Conference on Data Engineering), a top conference venue for data science research.
 
Description BP Collaboration Industrial CASE studentship project 
Organisation BP (British Petroleum)
Country United Kingdom 
Sector Private 
PI Contribution Novelty and Research Methodology: Declarative rules such as Prolog and Datalog rules are common formalisms for expressing expert knowledge and are used in a number of systems. Since developing such rules is time-consuming and requires scarce expert knowledge, it is essential to develop algorithms for learning them. This project addresses the problem of learning existential rules, which have found applications in many use cases such as Knowledge Graphs, the Semantic Web and Web Data Extraction. In particular, we concentrate on developing evolutionary learning algorithms for existential rules. We define the rule learning setting and review the main approaches to learning rules, such as top-down, bottom-up, and neural methods. We review existing evolutionary approaches to rule learning and discuss different genetic encoding schemes, initial population creation methods, evolution operators, and fitness functions for evaluation.
Collaborator Contribution BP provided detailed site visits and in-depth discussions with various business teams. We are in close contact with the refinery engineering team, which has supplied sufficient laboratory and production monitoring data to facilitate our pilot research activities.
Impact We have delivered prototype algorithms and models for sulfur level prediction in refinery production. We have also delivered several talks and workshops on Knowledge Graphs and their industrial applications.
Start Year 2017
 
Description Innovation panel moderator at PKUAA-UK 2023 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Moderator of the innovation panel at the annual conference of the Peking University Alumni Association in the United Kingdom (PKUAA-UK), and member of the conference organization team.
Year(s) Of Engagement Activity 2023
URL http://pkuaa.org.uk/about
 
Description Invited talk at KRW2022 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Invited talk at the 9th International Chinese Scholar Workshop on Knowledge Representation and Reasoning (KRW-2022).
Year(s) Of Engagement Activity 2022
URL http://home.cse.ust.hk/~flin/krw2022/program.html
 
Description Invited talk at School of Informatics, University of Edinburgh 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Invited research talk at the School of Informatics, University of Edinburgh.
Year(s) Of Engagement Activity 2023
URL https://www.ed.ac.uk/informatics
 
Description Organizer of the ECAI 2020 KR4L Workshop
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Acted as an organizer of the ECAI 2020 KR4L workshop, with responsibility for publicity, including the call for papers, social media marketing, inviting programme committee members, assigning reviews, and writing paper reviews and meta-reviews. Organizing an academic workshop provided a unique opportunity to understand academic evaluation and communication. More than 50 participants attended the online workshop, which had 8 talks, with publications on CEUR-WS.org.

The workshop 'Knowledge Representation & Representation Learning (KR4L)' was held in conjunction with the 24th European Conference on Artificial Intelligence (ECAI 2020). There is currently a perceived disconnect between the areas of Representation Learning (RL) and Knowledge Representation and Reasoning (KRR). Most research is concentrated on one area or the other, yet arguably representation learning is central to making use of knowledge representation and reasoning techniques in modern, scalable AI applications. This is particularly, but not exclusively, the case in the area of Knowledge Graphs.
Year(s) Of Engagement Activity 2020
URL https://smartdataanalytics.github.io/KR4L/
 
Description Program Committee member of AAAI 2021 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Acted as a Program Committee member of AAAI 2021, taking part in paper reviewing and discussions. Reviewed 5 papers in total, 2 of which were eventually published at AAAI 2021.
Year(s) Of Engagement Activity 2021, 2022
URL https://aaai.org/Conferences/AAAI-21/
 
Description Program Committee member of EDBT 2022 Workshop 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Knowledge Graphs are a recent and promising incarnation of database methodologies and technology, which is attracting increasing use within domains characterized by the presence of many interconnected entities, interacting via complex dynamics.

While the convergence towards a consolidated definition has not been reached yet, underlying the different notions of KGs, there is the use of graph-based data models and systems in complex domains, where the need for handling and operationalizing specific and frequently complex domain knowledge calls for smart Knowledge Representation and Reasoning (KRR) paradigms and solutions, including logic-based reasoning, graph embeddings, graph neural networks, probabilistic reasoning, handling of uncertainty, modeling of temporal graphs, and many more.

Among the broad variety of fields where KGs are finding use and adoption, their impact on the economic and financial sector will undoubtedly be a long-lasting one, due to a close fit between technology and business, as witnessed by: i) the presence of large or extreme-scale stores of economic and financial data with inherent network structure; ii) the natural emergence of complex economic, financial and, more generally, societal network dynamics to be modeled and captured; iii) an articulated regulatory body that defines the interactions between the involved entities (e.g., the Basel III regulation, the European Central Bank legal frameworks covering many fields such as prudential supervision of credit institutions, the Investment Firm Directive, the MiFID/MiFIR and PSD2 directives, etc.).

EcoFinKG aims to reduce the distance between the database and economics/finance communities, sustaining new research-backed economic and financial applications that knowingly use and demystify state-of-the-art data technology.
Year(s) Of Engagement Activity 2022
URL https://ecofinkg22.knowledgegraph.science/
 
Description Program Committee member of IJCAI2023 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact PC member of IJCAI 2023, contributing to paper reviewing and conference organization.
Year(s) Of Engagement Activity 2023
URL https://ijcai-23.org/