AI and Cognitive Computing for Reasoning about Big Data and Knowledge Graphs with Application to the Oil and Gas Industry
Lead Research Organisation:
University of Oxford
Department Name: Computer Science
Abstract
Brief Description: The broad aim of this doctoral project is to gain, in cooperation with BP, a deep understanding of how the latest AI and cognitive computing technologies can be used for reasoning over big data. One application of this project is in the oil and gas industry, where it supports and improves core business processes and decision making. Towards this aim, the student's planned main contributions are the development of a rule learner and reasoner for big data, a specific design of the learner for existential rules, and a systematic evaluation of its impact on applications important to BP.
To the mutual benefit of the student, BP, and Oxford University, and to maximise synergies, the studentship will be attached to the VADA "Value Added Data Systems" project which - as one of its goals - aims to develop a general-purpose reasoning system, building on the experience with the Datalog family of languages.
Alignment to EPSRC's Strategies and Research Areas: This project falls within the EPSRC Information and communication technologies (ICT) theme, and the following research areas: Artificial intelligence technologies, Databases, and Information systems.
Novelty and Research Methodology: Declarative rules such as Prolog and Datalog rules are a common formalism for expressing expert knowledge and are used in a number of systems. Since developing such rules is time-consuming and requires scarce expert knowledge, it is essential to develop algorithms for learning such rules. This project addresses the problem of learning existential rules, which have found applications in many use cases such as Knowledge Graphs, the Semantic Web and Web Data Extraction. In particular, we concentrate on developing evolutionary learning algorithms for existential rules. We define the rule learning setting and review the main approaches to learning rules, such as top-down, bottom-up, and neural methods. We review existing evolutionary approaches to rule learning and discuss different genetic encoding schemas, initial population creation methods, evolution operators, and fitness functions for evaluation.
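For readers unfamiliar with the term, an existential rule (also known as a tuple-generating dependency) extends a Datalog rule by allowing existentially quantified variables in the head. The general shape is sketched below; the predicate names in the second formula are purely illustrative and are not taken from the project.

```latex
\[
\forall \mathbf{x}\,\forall \mathbf{y}\;\bigl(\varphi(\mathbf{x},\mathbf{y}) \rightarrow \exists \mathbf{z}\;\psi(\mathbf{x},\mathbf{z})\bigr)
\]
% Illustrative instance: every employee works in some (possibly unknown) department.
\[
\forall x\;\bigl(\mathrm{Employee}(x) \rightarrow \exists d\;(\mathrm{WorksIn}(x,d) \wedge \mathrm{Department}(d))\bigr)
\]
```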
In addition, from a wider view, we explore four interaction models between logical reasoning engines and Machine Learning approaches. Last but not least, we outline the answers to the proposed research questions for existential rule learning with promising experimental results, and exhibit applications in crude corrosivity, knowledge graph rule mining and question answering data sets. This project focuses on studying Evolutionary Algorithms (EAs, also known as Genetic Algorithms, GAs) for the Inductive Logic Programming (ILP) problem. EAs are a family of biology-inspired search algorithms that optimize for the most promising preliminary solutions while exploring a wide search space at the same time. In particular, in our setting, atoms and partial or parameterized rules can be treated as chromosomes, from which a new population of chromosomes can be derived via the operations of mutation, crossover and selection. While performing these operations, a quality measure, called the fitness function, is computed to judge whether a newly obtained generation of chromosomes is fit for continuing the search. Such evolutionary algorithms typically do not perform exhaustive search and at the same time are less likely to fall into local optima. In addition, they are flexible in that they might not require an imposed template on the shape of rules, as is typically the case in other approaches to ILP.
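The loop of mutation, crossover, selection and fitness evaluation described above can be sketched in a few lines of Python. This is a toy illustration only and not the project's algorithm: it assumes a tiny hand-made fact base, restricts chromosomes to plain Datalog chain rules of the form head(X, Z) :- p(X, Y), q(Y, Z) (no existential variables), and uses the F1 score as a stand-in fitness function. All names (`FACTS`, `TARGET`, `derive`, `fitness`, `mutate`, `crossover`, `evolve`) are hypothetical.

```python
import random

# Toy knowledge graph: binary facts grouped by predicate (illustrative data).
FACTS = {
    "parent": {("ann", "bob"), ("bob", "cat"), ("bob", "dan"), ("eve", "ann")},
    "married": {("ann", "fred"), ("fred", "ann")},
    "sibling": {("cat", "dan"), ("dan", "cat")},
}
# Positive examples for the head predicate grandparent(X, Z).
TARGET = {("ann", "cat"), ("ann", "dan"), ("eve", "bob")}
PREDICATES = sorted(FACTS)


def derive(chromosome):
    """Apply the chain rule head(X, Z) :- p(X, Y), q(Y, Z) to the facts."""
    p, q = chromosome
    return {(x, z) for (x, y1) in FACTS[p] for (y2, z) in FACTS[q] if y1 == y2}


def fitness(chromosome):
    """F1 score of the derived facts against the target examples."""
    derived = derive(chromosome)
    hits = len(derived & TARGET)
    if hits == 0:
        return 0.0
    precision = hits / len(derived)
    recall = hits / len(TARGET)
    return 2 * precision * recall / (precision + recall)


def mutate(chromosome, rng):
    """Replace one randomly chosen body predicate."""
    genes = list(chromosome)
    genes[rng.randrange(len(genes))] = rng.choice(PREDICATES)
    return tuple(genes)


def crossover(a, b):
    """One-point crossover: first body atom from a, second from b."""
    return (a[0], b[1])


def evolve(generations=30, size=6, seed=0):
    """Run a small elitist evolutionary search over two-atom rule bodies."""
    rng = random.Random(seed)
    population = [(rng.choice(PREDICATES), rng.choice(PREDICATES))
                  for _ in range(size)]
    for _ in range(generations):
        # Selection: keep the fitter half, then refill with mutated offspring.
        population.sort(key=fitness, reverse=True)
        parents = population[: size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents)), rng)
            for _ in range(size - len(parents))
        ]
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best)


best_rule, score = evolve()
print(f"grandparent(X, Z) :- {best_rule[0]}(X, Y), {best_rule[1]}(Y, Z).  F1={score:.2f}")
```

Because the fitter half of the population is always carried over, the best fitness never decreases between generations. A realistic learner would need a richer encoding (variable-length bodies, existential head variables) and a fitness function that scales to large fact bases.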
Companies and Collaborators Involved: This is an EPSRC Industrial CASE studentship project in collaboration with BP.
People | ORCID iD |
---|---|
Georg Gottlob (Primary Supervisor) | |
Lianlong Wu (Student) | |
Publications
Wu L (2019) Evolutionary Learning of Existential Rules
Song Y (2020) Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence, in Proceedings of the AAAI Conference on Artificial Intelligence
Bellomarini L (2022) Data science with Vadalog: Knowledge Graphs with machine learning and reasoning in practice, in Future Generation Computer Systems
Tigga NP (2023) Efficacy of novel attention-based gated recurrent units transformer for depression detection using electroencephalogram signals, in Health Information Science and Systems
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name |
---|---|---|---|---|---|
EP/P510609/1 | | | 01/10/2016 | 30/09/2021 | |
2370505 | Studentship | EP/P510609/1 | 01/10/2017 | 29/09/2021 | Lianlong Wu |
Description | This research successfully produced a purely symbolic learning algorithm that meets the original objectives and facilitates explainable knowledge retention and discovery. The outcomes have been published in an academic journal and at a top conference, and have been applied to real-world knowledge graph data as well as industrial production data. |
Exploitation Route | The academic publications could be developed further by other researchers, and the industrial prototype could inspire further real-world applications. |
Sectors | Education,Energy,Financial Services, and Management Consultancy,Manufacturing, including Industrial Biotechnology |
Description | The research works on general knowledge and data and is therefore applicable across sectors. In particular, it reveals the power of explainable symbolic knowledge discovery with high precision for practical use. |
First Year Of Impact | 2021 |
Sector | Digital/Communication/Information Technologies (including Software),Financial Services, and Management Consultancy,Manufacturing, including Industrial Biotechnology |
Impact Types | Economic |
Description | Alibaba Cloud (UK) research fund for large scale knowledge graph challenges |
Amount | $20,000 (USD) |
Organisation | Alibaba Group |
Sector | Private |
Country | China |
Start | 12/2022 |
Description | IJCAI 2019 Doctoral Consortium Travel Award |
Amount | € 300 (EUR) |
Organisation | International Joint Conferences on Artificial Intelligence |
Sector | Charity/Non Profit |
Country | United States |
Start | 08/2019 |
End | 08/2019 |
Title | Evoda Algorithm |
Description | In addition, from a wider view, we explore four interaction models between logical reasoning engines and Machine Learning approaches. Last but not least, we outline the answers to the proposed research questions for existential rule learning with promising experimental results, and exhibit applications in crude corrosivity, knowledge graph rule mining and question answering data sets. This project focuses on studying Evolutionary Algorithms (EAs, also known as Genetic Algorithms, GAs) for the Inductive Logic Programming (ILP) problem. EAs are a family of biology-inspired search algorithms that optimize for the most promising preliminary solutions while exploring a wide search space at the same time. In particular, in our setting, atoms and partial or parameterized rules can be treated as chromosomes, from which a new population of chromosomes can be derived via the operations of mutation, crossover and selection. While performing these operations, a quality measure, called the fitness function, is computed to judge whether a newly obtained generation of chromosomes is fit for continuing the search. Such evolutionary algorithms typically do not perform exhaustive search and at the same time are less likely to fall into local optima. In addition, they are flexible in that they might not require an imposed template on the shape of rules, as is typically the case in other approaches to ILP. |
Type Of Material | Computer model/algorithm |
Year Produced | 2022 |
Provided To Others? | Yes |
Impact | It is published at ICDE, a top conference venue for data science research. |
Description | BP Collaboration Industrial CASE studentship project |
Organisation | BP (British Petroleum) |
Country | United Kingdom |
Sector | Private |
PI Contribution | Novelty and Research Methodology: Declarative rules such as Prolog and Datalog rules are a common formalism for expressing expert knowledge and are used in a number of systems. Since developing such rules is time-consuming and requires scarce expert knowledge, it is essential to develop algorithms for learning such rules. This project addresses the problem of learning existential rules, which have found applications in many use cases such as Knowledge Graphs, the Semantic Web and Web Data Extraction. In particular, we concentrate on developing evolutionary learning algorithms for existential rules. We define the rule learning setting and review the main approaches to learning rules, such as top-down, bottom-up, and neural methods. We review existing evolutionary approaches to rule learning and discuss different genetic encoding schemas, initial population creation methods, evolution operators, and fitness functions for evaluation. |
Collaborator Contribution | BP provided detailed site visits and in-depth discussions with various business teams. We are in close contact with the refinery engineering team, which has supplied sufficient laboratory and production monitoring data to facilitate our pilot research activities. |
Impact | We have delivered prototype algorithms and models for sulfur level prediction in refinery production. We have also delivered several talks and workshops on Knowledge Graphs and their industrial applications. |
Start Year | 2017 |
Description | Innovation panel moderator at PKUAA-UK 2023 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Moderated the innovation panel at the annual conference of the Peking University Alumni Association in the United Kingdom (PKUAA-UK) and served as a member of the conference organization team. |
Year(s) Of Engagement Activity | 2023 |
URL | http://pkuaa.org.uk/about |
Description | Invited talk at KRW2022 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Invited talk at the 9th International Chinese Scholar Workshop on Knowledge Representation and Reasoning (KRW-2022) |
Year(s) Of Engagement Activity | 2022 |
URL | http://home.cse.ust.hk/~flin/krw2022/program.html |
Description | Invited talk at School of Informatics, University of Edinburgh |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | Invited research talk at the School of Informatics, University of Edinburgh. |
Year(s) Of Engagement Activity | 2023 |
URL | https://www.ed.ac.uk/informatics |
Description | Organize ECAI 2020 KR4L Workshop |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Acted as an organizer of the ECAI 2020 KR4L workshop, handling publicity (including the call for papers and social media marketing), inviting programme committee members, assigning reviews, and writing paper reviews and meta-reviews. Organizing an academic event provided a unique opportunity to understand academic evaluation and communication. More than 50 participants attended the online workshop, which featured 8 talks with papers published on CEUR-WS.org. The workshop 'Knowledge Representation & Representation Learning (KR4L)' was held in conjunction with the 24th European Conference on Artificial Intelligence (ECAI 2020). There is currently a perceived disconnect between the areas of Representation Learning (RL) and Knowledge Representation and Reasoning (KRR). Most research is concentrated on one area or the other, yet arguably representation learning is central to making use of knowledge representation and reasoning techniques in modern, scalable AI applications. This is particularly the case in, but not restricted to, the area of Knowledge Graphs. |
Year(s) Of Engagement Activity | 2020 |
URL | https://smartdataanalytics.github.io/KR4L/ |
Description | Program Committee member of AAAI 2021 |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Acted as a Program Committee member of AAAI 2021, participating in paper reviewing and discussions. Reviewed 5 papers in total, 2 of which were eventually published at AAAI 2021. |
Year(s) Of Engagement Activity | 2021,2022 |
URL | https://aaai.org/Conferences/AAAI-21/ |
Description | Program Committee member of EDBT 2022 Workshop |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Knowledge Graphs are a recent and promising incarnation of database methodologies and technology, which is attracting increasing use within domains characterized by the presence of many interconnected entities, interacting via complex dynamics. While the convergence towards a consolidated definition has not been reached yet, underlying the different notions of KGs, there is the use of graph-based data models and systems in complex domains, where the need for handling and operationalizing specific and frequently complex domain knowledge calls for smart Knowledge Representation and Reasoning (KRR) paradigms and solutions, including logic-based reasoning, graph embeddings, graph neural networks, probabilistic reasoning, handling of uncertainty, modeling of temporal graphs, and many more. Among the broad variety of fields where KGs are finding use and adoption, their impact on the economic and financial sector will be undoubtedly a long-lasting one, due to a close fit between technology and business, as witnessed by: i) the presence of large or extreme-scale stores of economic and financial data with inherent network structure; ii) the natural emergence of complex economic, financial and more in general societal network dynamics to be modeled and captured; iii) an articulated regulatory body that defines the interactions between the involved entities (e.g., the Basel III regulation, the European Central Bank legal frameworks covering many fields such as prudential supervision of credit institutions, the Investment Firm Directive, the MiFID/MiFIR and PSD2 directives, etc.). EcoFinKG wants to reduce the distance between the database and economics/finance communities, sustaining new research-backed economic and financial applications that awarely use and demystify state-of-the-art data technology. |
Year(s) Of Engagement Activity | 2022 |
URL | https://ecofinkg22.knowledgegraph.science/ |
Description | Program Committee member of IJCAI2023 |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | PC member of IJCAI2023, involved in paper reviewing and conference organization. |
Year(s) Of Engagement Activity | 2023 |
URL | https://ijcai-23.org/ |