ConCur: Knowledge Base Construction and Curation

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

Knowledge graphs are graph-structured knowledge resources which are often expressed as triples such as ("UK", "hasCapital", "London") and ("London", "instanceOf", "City"). As well as such basic "facts", knowledge graphs often include structural knowledge about the domain, typically based on a hierarchy of entity types (also known as classes or concepts); e.g., ("City", "subClassOf", "HumanSettlement"). A knowledge graph that consists largely or wholly of structural knowledge is often called an ontology.
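As a minimal sketch of the triple representation described above, the three example triples can be held as plain Python tuples, and the class hierarchy can be followed to infer that London is also a HumanSettlement. The function names `superclasses` and `types_of` are illustrative assumptions, not part of any particular knowledge graph toolkit.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# The three example triples from the text: two facts and one structural axiom.
TRIPLES: Set[Triple] = {
    ("UK", "hasCapital", "London"),
    ("London", "instanceOf", "City"),
    ("City", "subClassOf", "HumanSettlement"),
}


def superclasses(cls: str, triples: Set[Triple]) -> Set[str]:
    """Return every class reachable from cls by following subClassOf edges."""
    found: Set[str] = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def types_of(entity: str, triples: Set[Triple]) -> Set[str]:
    """Return the asserted types of entity plus those implied by the hierarchy."""
    direct = {o for s, p, o in triples if s == entity and p == "instanceOf"}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


london_types = types_of("London", TRIPLES)
```

Here `london_types` contains both the asserted type "City" and the inferred type "HumanSettlement", which is the kind of structural knowledge an ontology contributes.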

Some knowledge graphs are general purpose, such as Wikidata and the Google knowledge graph, while others are developed for specific domains such as medicine. They are rapidly gaining in importance and are playing a key role in many applications. For example, Google uses its knowledge graph for search, question answering and Google Assistant, while Amazon and Apple also use knowledge graphs to power their personal assistants Alexa and Siri, respectively. Knowledge graphs are widely used in the domain of health and wellbeing, e.g., for organising and exchanging information and to power clinical artificial intelligence (AI). One example is FoodOn, an ontology representing food knowledge such as fine-grained food product categorisation, nutrition and allergens, as well as related activities such as agriculture.

Knowledge graph construction and maintenance is, however, very challenging, and may require a considerable amount of human effort. Notwithstanding the high cost of knowledge creation, knowledge graphs are often still biased, incomplete or too coarse-grained. Take HeLis, an ontology for health and lifestyle, as an example. Its food knowledge is quite simple and often represents many different variants with a single entity (e.g., "Banana" for all kinds and derivatives of bananas), and its knowledge of health is highly incomplete when compared with dedicated biomedical ontologies. In addition, it is hard to avoid errors such as incorrect facts and categorisations in knowledge graphs; e.g., FoodOn categorises soy milk as a kind of milk, but not as a kind of soy product. Such errors may be inherited from the information source or be caused by the construction procedure. These issues significantly impact the usefulness of knowledge graphs and the reliability of the systems that use them; e.g., the categorisation of soy milk could be dangerous if the knowledge graph were used in a food allergen alert system.

Therefore, effective knowledge graph construction and curation is urgently required and will play a critical role in exploiting the full value of knowledge graphs. As there are now many available knowledge resources, one possible approach is to use multiple sources to address both coverage and quality issues, e.g., via integration and cross-checking. For example, integrating HeLis with FoodOn would combine fine-grained categorisation of food products (including bananas) with lifestyle knowledge. Moreover, cross-checking FoodOn against HeLis would reveal the problem with soy milk, which is correctly categorised as a soy product in HeLis. Automating the integration of knowledge resources is challenging, but combining semantic and learning-based techniques seems to be a very promising approach, and we have already obtained some encouraging preliminary results in this direction.
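The cross-checking idea can be sketched as a comparison of the parent categories that two sources assign to the same entity. This is only an illustration under strong assumptions: the entities are taken to be already aligned by label (in practice the alignment is itself the hard part), and the dictionaries `foodon_like` and `helis_like` are toy data modelled on the soy milk example, with a made-up banana entry, rather than extracts from the real ontologies. The function name `cross_check` is likewise hypothetical.

```python
from typing import Dict, Set

Categories = Dict[str, Set[str]]


def cross_check(a: Categories, b: Categories) -> Dict[str, Dict[str, Set[str]]]:
    """Report, for each entity present in both sources, the parent
    categories that appear in only one of them."""
    report: Dict[str, Dict[str, Set[str]]] = {}
    for entity in sorted(a.keys() & b.keys()):
        only_a = a[entity] - b[entity]
        only_b = b[entity] - a[entity]
        if only_a or only_b:
            report[entity] = {"only_in_a": only_a, "only_in_b": only_b}
    return report


# Toy data (assumption): entity label -> asserted parent categories.
foodon_like: Categories = {"soy milk": {"milk"}, "banana": {"fruit"}}
helis_like: Categories = {"soy milk": {"soy product"}, "banana": {"fruit"}}

report = cross_check(foodon_like, helis_like)
```

A disagreement only flags a candidate for review: it does not say which source is right, which is one place where the semantic and learning-based techniques mentioned above could help.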

The proposed research will therefore study a range of semantic and machine learning techniques, and how to combine them to support knowledge graph construction and curation. As well as its application to knowledge graph construction and curation, this research will also contribute to the development of new neural-symbolic theories, paradigms and methods, such as deep semantic embedding for learning representations of expressive knowledge, and knowledge-guided learning for addressing sample shortage problems. These techniques promise to revolutionise many AI and big data technologies.
 
Description: Collaboration with Bosch
Organisation: Bosch Group
Department: Bosch
Country: Germany
Sector: Private
PI Contribution: PhD research
Collaborator Contribution: Real-life problems and funding for PhD student
Impact: PhD funding
Start Year: 2021
 
Description: Collaboration with Samsung Research UK
Organisation: Samsung
Department: Samsung, UK
Country: United Kingdom
Sector: Private
PI Contribution: Collaboration with Samsung Research UK
Collaborator Contribution: Research problems and funding for PhD students and PDRAs
Impact: Publications and funding
Start Year: 2019
 
Description: Collaboration with Siemens
Organisation: Siemens AG
Country: Germany
Sector: Private
PI Contribution: PhD research
Collaborator Contribution: Real-life problems and funding for PhD student
Impact: PhD funding
Start Year: 2019
 
Description: Keynote talk at Declarative AI conference
Form Of Engagement Activity: A talk or presentation
Part Of Official Scheme? No
Geographic Reach: International
Primary Audience: Postgraduate students
Results and Impact: Keynote at DeclarativeAI conference about our research and spin-out activities on knowledge graphs
Year(s) Of Engagement Activity: 2022
 
Description: Keynote talk at LDAC conference
Form Of Engagement Activity: A talk or presentation
Part Of Official Scheme? No
Geographic Reach: International
Primary Audience: Postgraduate students
Results and Impact: Keynote at LDAC to present our research and spin-out activities on knowledge graphs
Year(s) Of Engagement Activity: 2022
 
Description: Presentation at Huawei
Form Of Engagement Activity: A talk or presentation
Part Of Official Scheme? No
Geographic Reach: International
Primary Audience: Industry/Business
Results and Impact: Talk at Huawei to inform them about our research and spin-out activities on knowledge graphs
Year(s) Of Engagement Activity: 2022
 
Description: Presentation at SAP
Form Of Engagement Activity: A talk or presentation
Part Of Official Scheme? No
Geographic Reach: International
Primary Audience: Industry/Business
Results and Impact: Talk at SAP to inform them about our research and spin-out activities on knowledge graphs
Year(s) Of Engagement Activity: 2022