eNeMILP: Non-Monotonic Incremental Language Processing
Lead Research Organisation:
University of Cambridge
Department Name: Computer Science and Technology
Abstract
Research in natural language processing (NLP) is driving advances in many applications such as search engines and personal digital assistants, e.g. Apple's Siri and Amazon's Alexa. In many NLP tasks the output to be predicted is a graph representing the sentence, e.g. a syntax tree in syntactic parsing or a meaning representation in semantic parsing. Furthermore, in other tasks such as natural language generation and machine translation the predicted output is text, i.e. a sequence of words. Both types of NLP tasks have been tackled successfully with incremental modelling approaches in which prediction is decomposed into a sequence of actions constructing the output.
Despite its success, a fundamental limitation of incremental modelling is that the actions considered typically construct the output monotonically, e.g. in natural language generation each action adds a word to the output but never removes or changes a previously predicted one. Relying exclusively on monotonic actions can therefore decrease accuracy, since the effect of incorrect actions cannot be amended. Furthermore, these incorrect actions will be used to predict the following ones, which is likely to result in an error cascade.
We propose an 18-month project to address this limitation and learn non-monotonic incremental language processing models, i.e. incremental models that consider actions that can "undo" the outcome of previously predicted ones. The challenge in incorporating non-monotonic actions is that, unlike their monotonic counterparts, they are not straightforward to infer from the labelled data typically available for training, thus rendering standard supervised learning approaches inapplicable. To overcome this issue we will develop novel algorithms under the imitation learning paradigm to learn non-monotonic incremental models without assuming action-level supervision, relying instead on instance-level loss functions and the model's own predictions in order to learn how to recover from incorrect actions to avoid error cascades.
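To illustrate the idea, the following toy sketch (purely illustrative, not the project's actual model; the action names and the example action sequence are hypothetical) shows how extending a monotonic action set with a non-monotonic UNDO action lets an incremental predictor retract an incorrect word instead of building on it:

```python
# Illustrative toy sketch (not the project's model): an incremental
# "decoder" that builds a partial output word by word, with one
# non-monotonic action in its action set.

def apply_action(output, action):
    """Apply one action to the partial output (a list of words)."""
    kind, word = action
    if kind == "APPEND":        # monotonic: add a word at the end
        return output + [word]
    if kind == "UNDO":          # non-monotonic: retract the last word
        return output[:-1]
    raise ValueError(f"unknown action: {kind}")

# Hypothetical action sequence: an incorrect prediction ("dog") is
# amended via UNDO rather than cascading into later predictions.
actions = [
    ("APPEND", "the"),
    ("APPEND", "dog"),   # incorrect action
    ("UNDO", None),      # recovery: undo the previous action
    ("APPEND", "cat"),
    ("APPEND", "sat"),
]

output = []
for action in actions:
    output = apply_action(output, action)

print(" ".join(output))  # -> "the cat sat"
```

With only monotonic actions, the incorrect "dog" would remain in the output and condition every subsequent prediction. The UNDO action is precisely what is hard to supervise: it never appears in the labelled data, which is why we propose learning when to use it via imitation learning rather than standard supervised learning.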
To succeed in this goal, this proposal has the following research objectives:
1) To model non-monotonic incremental prediction of structured outputs in a generic way that can be applied to a variety of tasks with natural language text as output.
2) To learn non-monotonic incremental predictors using imitation learning and improve upon the accuracy of monotonic incremental models, both in terms of automatic measures such as BLEU and in human evaluation.
3) To extend the proposed approach to structured prediction tasks with graphs as output.
4) To release software implementations of the proposed methods to facilitate reproducibility and wider adoption by the research community.
The research proposed focuses on a fundamental limitation of incremental language processing models, which have been successfully applied to a variety of natural language processing tasks; we therefore anticipate that the proposal will have a wide academic impact. Furthermore, the tasks we will evaluate it on, namely natural language generation and semantic parsing, are essential components of natural language interfaces and personal digital assistants. Improving these technologies will enhance accessibility to digital information and services. We will demonstrate the benefits of our approach through our collaboration with our project partner Amazon, who are supporting the proposal both with cloud computing credits and by hosting the research associate in order to apply the outcomes of the project to industry-scale datasets.
Planned Impact
- Economy
The two applications we will focus on in the project, natural language generation and semantic parsing, are key technologies in a variety of commercial products which require generating and understanding language. In particular, personal digital assistants such as Google Now, Microsoft's Cortana, Amazon's Alexa and Apple's Siri are used by millions of users at home or on their mobile devices and are of great importance to these companies since they act as gateways to many of the services and products offered by them.
- Society
Personal digital assistants and natural language interfaces are used by a large number of users. Improving language generation and semantic parsing through non-monotonic incremental language processing is therefore likely to benefit these end users by improving their experience. We will explore this during the RA's research visit at Amazon and test our approach in the context of Alexa.
- Knowledge
The project aims to address a fundamental limitation in an approach successfully applied to a variety of natural language processing tasks. We therefore anticipate publishing our results in high-profile natural language processing conferences. Furthermore, we will accompany the paper publications with open-source implementations of our approach in the project's GitHub repository.
- People
The project will have a positive impact on the careers of both the PI and the RA. It will enable the PI to build on the expertise he has developed in incremental language processing using imitation learning, and thus solidify his position in the field while addressing a fundamental shortcoming of the approach. An EPSRC First Grant would be of great significance to the PI, as it will be the first time he proposes and delivers a project of his own, providing him with valuable experience and strengthening his profile when applying for further funding. Finally, the named RA has worked on language generation throughout his career, most recently with the PI on applying imitation learning to this task, achieving state-of-the-art results.
People | Andreas Vlachos (Principal Investigator)
Publications
Finnimore P (2019) Strong Baselines for Complex Word Identification across Multiple Languages
Hardy (2019) HighRES: Highlight-based Reference-less Evaluation of Summarization, in Apollo - University of Cambridge Repository
Hargreaves J (2021) Incremental Beam Manipulation for Natural Language Generation
Joseph Fisher (2019) Merge and Label: A novel neural network architecture for nested NER, in Apollo - University of Cambridge Repository
Description | During this project we had the following findings:
- We improved incremental language generation in the context of summarization with state-of-the-art sequence-to-sequence models, by modifying the decoder to predict words while taking the source document into account directly.
- We improved incremental structured prediction in the context of nested named entity recognition, developing a method that maintains scores for multiple hypotheses at different levels of granularity and then combines them to reach the final result, improving on the state of the art.
- We improved incremental generative discourse parsing with neural models, developing an improved beam search algorithm that avoids biases present in previous ones and achieves state-of-the-art results.
- We proposed a better method for evaluating summarization using human annotators.
- We proposed multilingual baselines for identifying words that need to be simplified, which could provide a starting point for non-monotonic incremental manipulation. |
Exploitation Route | The improved structured prediction models for nested named entity recognition and discourse parsing are likely to be used by others in their work. Furthermore, the methods themselves are applicable to other tasks, since approaches such as beam search are omnipresent in the field. The summarization evaluation is likely to influence other researchers in how they conduct and evaluate their work. |
Sectors | Digital/Communication/Information Technologies (including Software) |
Description | The postdoc working on this grant is now applying his expertise in incremental language processing in commercial applications at Huawei. |
First Year Of Impact | 2019 |
Sector | Digital/Communication/Information Technologies (including Software) |
Impact Types | Economic |
Description | EPSRC Research Grant: Opening Up Minds: Engaging Dialogue Generated From Argument Maps |
Amount | £850,000 (GBP) |
Funding ID | EP/T024666/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 09/2020 |
End | 08/2022 |
Title | Model for improving summarization with source document predictions |
Description | This code implements our proposal for improving the output of summarization with information from the original document. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | It achieved state-of-the-art results on a well-studied dataset.
URL | https://github.com/sheffieldnlp/AMR2Text-summ |
Title | Software and model for state of the art nested named entity recognition |
Description | We introduce a novel neural network architecture that first merges tokens and/or entities into entities forming nested structures, and then labels each of them independently. Unlike previous work, our merge-and-label approach predicts real-valued instead of discrete segmentation structures, which allows it to combine word and nested entity embeddings while maintaining differentiability.
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | The software and model proposed achieved state-of-the-art results on the most commonly used datasets for nested NER.
URL | https://github.com/fishjh2/merge_label |
Title | Software, methodology for improved human evaluation of summarization |
Description | We proposed a novel approach for manual evaluation, HIGHlight-based Reference-less Evaluation of Summarization (HIGHRES), in which summaries are assessed by multiple annotators against the source document via manually highlighted salient content in the latter. Thus summary assessment on the source document by human judges is facilitated, while the highlights can be used for evaluating multiple systems. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | It proposed an efficient and improved way to conduct human evaluation for a very commonly studied task. |
URL | https://github.com/sheffieldnlp/highres |
Description | Human evaluation for automatic summarization |
Organisation | |
Department | Google UK |
Country | United Kingdom |
Sector | Private |
PI Contribution | We collaborated with Shashi Narayan from Google Research to develop novel methods for human evaluation of automatic summarization.
Collaborator Contribution | Our partner's researcher was involved in designing the study, writing the paper and presenting it. |
Impact | The collaboration resulted in a paper accepted at the top conference (ACL2019) in our field for an oral presentation: https://arxiv.org/abs/1906.01361 |
Start Year | 2018 |
Description | Neural Incremental Discourse Parsing |
Organisation | DeepMind Technologies Limited |
Country | United Kingdom |
Sector | Private |
PI Contribution | We contributed expertise on training incremental language processing models |
Collaborator Contribution | DeepMind provided expertise on discourse parsing and neural models |
Impact | A jointly authored paper was published at EMNLP 2019: https://arxiv.org/pdf/1907.00464.pdf |
Start Year | 2018 |
Title | Software implementing incremental text prediction for summarization with side information |
Description | It allows editing the predictions of incremental models to take side information into account, improving their outputs.
Type Of Technology | Software |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | Achieved state-of-the-art results on a well-known dataset.
URL | https://github.com/sheffieldnlp/AMR2Text-summ |
Description | Talk at Amazon Research Day in Cambridge |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Industry/Business |
Results and Impact | About 80 Amazon employees attended my talk which resulted in increased interactions and exploration of possible collaborations. |
Year(s) Of Engagement Activity | 2018 |
URL | https://ard.amazon-ml.com/cambridge/ |
Description | Talk at Google Research London |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | Began a collaboration with Google Research.
Year(s) Of Engagement Activity | 2019 |
Description | Talk at Lancaster University |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Postgraduate students |
Results and Impact | Talk at one of the largest NLP research groups in the country. |
Year(s) Of Engagement Activity | 2019 |
Description | Talk at Technische Universität Darmstadt
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Gave a talk on imitation learning research supported by this grant. Audience reported improved understanding of imitation learning. |
Year(s) Of Engagement Activity | 2018 |
Description | Talk at the Institute for Logic, Language and Computation, University of Amsterdam |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Gave a talk on imitation learning research supported by this grant. |
Year(s) Of Engagement Activity | 2018 |
Description | Talk at the NLP group at the Department of Computer Science at the University of Copenhagen |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Gave a talk on imitation learning research supported by this grant. Audience reported improved understanding of imitation learning. |
Year(s) Of Engagement Activity | 2018 |
Description | Talk at the University of Edinburgh |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Postgraduate students |
Results and Impact | Presented results to one of the largest NLP research groups in the country.
Year(s) Of Engagement Activity | 2019 |