The Role of Neural Models in (Constrained) Natural Language Generation

Lead Research Organisation: University of Cambridge
Department Name: Engineering

Abstract

Building on the mathematical foundations laid out by Markov [1], n-gram language models endow a word with a probability distribution that depends on its context (the n-1 words preceding it). While these models led to advances in language technologies as diverse as speech recognition [2] and machine translation [3, 4], the cardinality of their parametrisation grows exponentially with the context length, which limits their applicability. Recurrent neural networks can model contexts of arbitrary length [5], which led to the widespread adoption of these models in many language tasks [6]. By overcoming the computational efficiency limitations of recurrent architectures, the neural "transformer" architecture proposed in [7] led to the development of a range of neural language models that achieve state-of-the-art performance in a variety of natural language tasks [8, 9, 10].
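For reference, the textbook n-gram factorisation behind this description (a standard formulation, not specific to this project) can be written in LaTeX notation as

  P(w_1, \dots, w_T) \approx \prod_{t=1}^{T} P(w_t \mid w_{t-n+1}, \dots, w_{t-1}),

so that, for a vocabulary V, the table of conditional probabilities contains on the order of |V|^n entries, which is why the parametrisation grows exponentially with the context length.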
This project aims to advance artificial intelligence (AI) based technologies for natural language by investigating how neural language models can be employed to incorporate user goals in natural language generation.
Such goals might be expressed through interaction with the system, as is the case in conversational AI. Inspired by recent developments in the field where neural models have been employed to replace complex model pipelines [11, 12, 13], the project will explore and provide novel methods that will allow these systems to:
better estimate, track and ground their output in user intent
be adapted or self-adapt to satisfy new user goals
Goals might also be expressed through constraints that a language engineer would like to embed in the generation task. For example, generating gender inflections in languages where these depend on the social gender of a human referent is challenging for state-of-the-art automatic translation systems [14], and recent work has shown that embedding such constraints in neural models is non-trivial [15]. Solving this problem extends beyond gender bias mitigation, as similar approaches might be pursued to constrain neural systems to generate gender-neutral translations.
This project aims to take a broad approach to advancing the state of the art in user-constrained language generation, investigating:
appropriate data sources and their optimal representation
architectural changes necessary to accommodate new/richer input data representations
new training methodologies, including novel objectives and training in conjunction with other systems
novel decoding processes that account for constraints
adaptation techniques
References
[1] Markov, A. A. (1913). Essai d'une recherche statistique sur le texte du roman "Eugene Onegin" illustrant la liaison des épreuves en chaîne ('Example of a statistical investigation of the text of "Eugene Onegin" illustrating the dependence between samples in a chain'). Izvestia Imperatorskoi Akademii Nauk (Bulletin de l'Académie Impériale des Sciences de St.-Pétersbourg), 7, 153-162.
[2] Povey, D., & Woodland, P. C. (2002, May). Minimum phone error and I-smoothing for improved discriminative training. In 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing (Vol. 1, pp. I-105). IEEE.
[3] Chiang, D. (2005, June). A hierarchical phrase-based model for statistical machine translation. In Proceedings of the 43rd annual meeting of the association for computational linguistics (ACL'05) (pp. 263-270).
[4] de Gispert, A., Iglesias, G., Blackwood, G., Banga, E. R., & Byrne, W. (2010). Hierarchical phrase-based translation with weighted finite-state transducers and shallow-n grammars. Computational Linguistics, 36(3), 505-533.
[5] Mikolov, T., Karafiát, M., Burget, L., Cernocky, J., & Khudanpur, S. (2010). Recurrent neural network based language model. In Eleventh annual conference of the international speech communication association (pp. 1045-1048).
[6] Jurafsky, D., & Martin, J. H. (n.d.). Sequence Processing with Neural Networks. In Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition.

Publications


Studentship Projects

Project Reference   Relationship   Related To      Start        End          Student Name
EP/R513180/1                                       01/10/2018   30/09/2023
2438674             Studentship    EP/R513180/1    01/10/2020   31/01/2025   Alexandru Coca
EP/T517847/1                                       01/10/2020   30/09/2025
2438674             Studentship    EP/T517847/1    01/10/2020   31/01/2025   Alexandru Coca
 
Description Task-oriented dialogues are carried out between a user and a virtual agent; such conversations allow users to access a variety of web services via natural language interfaces. This award has facilitated progress on constrained natural language generation for task-oriented dialogues, with a focus on a) improving evaluation methods for neural language generation models and b) improving the robustness of neural models that convert written dialogues into keyword sequences describing the user's intention. For a), a novel evaluation protocol for neural models simulating human users was proposed in "GCDF1: A Goal- and Context- Driven F-Score for Evaluating User Models", published at The First Workshop on Evaluations and Assessments of Neural Conversation Systems in 2021. For b), methods for improving the robustness of neural models for user intent detection have been developed and described in two papers. The first, entitled "More Robust Schema-Guided Dialogue State Tracking via Tree-Based Paraphrase Ranking", shows how paraphrase and semantic models can be applied to improve the robustness of user intent detection. The second, entitled "Grounding Description-Driven Dialogue State Trackers with Knowledge-Seeking Turns", shows how the diversity of human language in dialogue corpora can be exploited to achieve the same objective.
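As a minimal illustration of the kind of structured "keyword sequence" such models are trained to produce, consider the toy Python sketch below; the utterance, intent and slot names are hypothetical examples and do not come from the project's schemas, datasets or models:

# Minimal illustrative sketch of dialogue state tracking output.
# All intent and slot names are hypothetical, not the project's schemas.
def track_state(user_utterance: str) -> dict:
    """Toy rule-based 'tracker' mapping one utterance to a structured state."""
    intent = None
    slots = {}
    text = user_utterance.lower()
    if "table" in text or "restaurant" in text:
        intent = "ReserveRestaurant"
    if "for two" in text:
        slots["party_size"] = "2"
    if "7pm" in text:
        slots["time"] = "7pm"
    return {"intent": intent, "slots": slots}

# Prints {'intent': 'ReserveRestaurant', 'slots': {'party_size': '2', 'time': '7pm'}}
print(track_state("Book me a table for two at 7pm"))

A neural tracker replaces the hand-written rules above with a learned model, and the project's robustness work concerns keeping such predictions stable when the same intent is phrased in different ways.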
Exploitation Route The discoveries made so far could contribute to the widespread deployment of language technologies for natural language interaction between humans and machines. If combined with advances in speech recognition technologies, this work could ultimately contribute to the development of assistive technologies that people with visual disabilities can leverage to carry out a wide variety of tasks. Equally, the research produced is important for advancing the capabilities of language systems such as Siri or Alexa.
Sectors Digital/Communication/Information Technologies (including Software), Retail

 
Description Oral Presentation at The First Workshop on Evaluations and Assessments of Neural Conversation Systems (EANCS 2021)
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact I was invited to present "GCDF1: A Goal- and Context- Driven F-Score for Evaluating User Models" at The First Workshop on Evaluations and Assessments of Neural Conversation Systems in 2021.
Year(s) Of Engagement Activity 2021
URL https://aclanthology.org/2021.eancs-1.0/
 
Description Poster Presentation at the 60th Annual Meeting of the Association for Computational Linguistics 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact The work "uFACT: Unfaithful Alien-Corpora Training for Semantically Consistent Data-to-Text Generation", for whose publication I was responsible and to which I made substantial intellectual contributions, was presented at the poster session of the ACL 2022 conference. ACL is a high-profile conference, and presenting there allowed us to disseminate the findings of our research to a broad audience.
Year(s) Of Engagement Activity 2022