Neural Event Extraction from Scientific Literature

Lead Research Organisation: University of Manchester

Department Name: Computer Science

Abstract

The main objective of the thesis is to investigate event extraction models that can extract multiple structures from text. Although many neural models have been proposed for event extraction, most have focused on flat events and flat entities. State-of-the-art approaches to event extraction are pipeline systems, which address small subtasks independently and in sequence in order to solve a more complex task. However, sequential processing may result in suboptimal overall performance compared with approaching the subtasks jointly, because errors made in any of a pipeline's early subtasks are propagated to the subtasks that follow. Pipeline-based approaches also fail to capture inter-dependencies among the different subtasks. As a result, elements that are dealt with separately in a pipeline (e.g., named entities and events) cannot be used for mutual disambiguation. Only a handful of event extraction approaches have employed embeddings, and those have used only lexical and syntactic features of words.
We will investigate novel neural structured prediction models for event extraction that build on contextual embeddings and simultaneously detect entities/triggers, role layers, trigger-argument pairs and event layers. Our aim is to maximise both precision and coverage. By decomposing deep information extraction into separate components, we can treat each one as an individual system, which supports easy connectivity between components. The models of these components will support joint learning, so the individual models have to share representations. To achieve high precision, we will deploy joint training across the different components of the event extraction pipeline and incorporate multi-task learning with other tasks, e.g. POS tagging and language modelling. We will also deal with complex data representations such as nested and disjoint entities, triggers and events, and take into account the semantic relational structure between words to improve performance. To achieve high coverage, we will investigate joint representations of linguistic units such as characters, sub-words, phrases and multi-word expressions, which will enable more flexible joint learning from external manually curated resources such as knowledge bases, ontologies, lexica and databases.

Student:

Panagiotis Georgiadis

Period of Study:

Sep 2019 - Sep 2023

Funder:

EPSRC

Project Status:

Closed

Project Category:

Studentship

Project Reference:

2324763

Research Topic:

Unclassified

People	ORCID iD
Panagiotis Georgiadis (Student)

Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
EP/R513131/1			30/09/2018	29/09/2023
2324763	Studentship	EP/R513131/1	30/09/2019	29/09/2023	Panagiotis Georgiadis
EP/T517823/1			30/09/2020	29/09/2025
2324763	Studentship	EP/T517823/1	30/09/2019	29/09/2023	Panagiotis Georgiadis
