Automating Fact Checking using Embeddings
Lead Research Organisation:
University of Sheffield
Department Name: Computer Science
Abstract
Fact checking is the task of assessing the truthfulness of publicly made claims. In a journalistic context, this task is costly and time-consuming: researchers often spend anywhere between an hour and a day validating information from a number of sources prior to publication. While crowd-sourcing databases of facts is an option, such supervised approaches do not scale to open-domain fact checking.
An accurate, fully automated fact checker would eliminate the need for human intervention and has long been considered the holy grail for this task. Automated fact checking can be decomposed into two subtasks, which build upon the Information Extraction (IE) and Question Answering (QA) domains.
Firstly, a knowledge-base must be constructed. Current state-of-the-art IE techniques exploit Logical Tensor Calculus to express a semantic model but still require some degree of supervision.
Further research is required both to reduce supervision and to allow for online learning of new facts. Secondly, input 'facts' to be checked must be translated into a suitable query form that allows for retrieval and scoring using the knowledge-base. Popular approaches either cluster question forms to extract entities for querying or use subgraph embeddings for matching. Whether embeddings generated using tensor models are suitable for automated fact checking and QA is an unanswered question.
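To make the idea of scoring a claim against an embedded knowledge-base concrete, the following is a minimal sketch and not the project's method. It assumes a bilinear scoring function of the kind used in tensor-factorisation models such as RESCAL, where each entity is a vector and each relation is a matrix. All names (`ENTITIES`, `RELATIONS`, `bilinear_score`, `check_claim`), the hand-written toy embeddings, and the threshold are hypothetical; in practice the embeddings would be learned from data.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]

# Hypothetical toy embeddings, written by hand for illustration only.
ENTITIES: Dict[str, Vector] = {
    "paris": [1.0, 0.0],
    "france": [0.0, 1.0],
    "germany": [0.0, -1.0],
}
RELATIONS: Dict[str, Matrix] = {
    "capital_of": [[0.0, 1.0], [0.0, 0.0]],
}


def bilinear_score(subj: Vector, rel: Matrix, obj: Vector) -> float:
    """Compute subj^T * rel * obj for a (subject, relation, object) triple."""
    return sum(
        subj[i] * rel[i][j] * obj[j]
        for i in range(len(subj))
        for j in range(len(obj))
    )


def check_claim(subj: str, rel: str, obj: str, threshold: float = 0.0) -> bool:
    """Treat a claim as supported if its score exceeds an assumed threshold."""
    score = bilinear_score(ENTITIES[subj], RELATIONS[rel], ENTITIES[obj])
    return score > threshold


if __name__ == "__main__":
    print(check_claim("paris", "capital_of", "france"))
    print(check_claim("paris", "capital_of", "germany"))
```

In this sketch the claim has already been reduced to a (subject, relation, object) triple; translating free-text claims into that query form is the second subtask described above and is not shown.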
People | ORCID iD |
---|---|
Andreas Vlachos (Primary Supervisor) | |
James Thorne (Student) | |
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name |
---|---|---|---|---|---|
EP/N509735/1 | | | 01/10/2016 | 30/09/2021 | |
1905693 | Studentship | EP/N509735/1 | 26/09/2016 | 26/01/2020 | James Thorne |