CiViL: Common-sense- and Visually-enhanced natural Language generation

Lead Research Organisation: Edinburgh Napier University

Department Name: School of Computing

Abstract

One of the most compelling problems in Artificial Intelligence is to create computational agents capable of interacting in real-world environments using natural language. Computational agents such as robots can offer multiple benefits to society: for instance, they can look after the ageing population, act as companions, support skills training, or provide assistance in public spaces. These are extremely challenging tasks due to their complex interdisciplinary nature, which spans several fields including Natural Language Generation, engineering, computer vision, and robotics.

Communication through language is the most vital and natural way of interaction. Humans are able to communicate effectively with each other using natural language, utilising common-sense knowledge and making inferences about other people's backgrounds based on previous interactions with them. At the same time, they can successfully describe their surroundings, even when encountering unknown entities and objects. For decades, researchers have tried to recreate the way humans communicate through natural language, and although there have been major breakthroughs in recent years (such as Apple's Siri or Amazon's Alexa), Natural Language Generation systems still lack the ability to reason, exploit common-sense knowledge, and utilise multi-modal information from a variety of sources such as knowledge bases, images, and videos.

This project aims to develop a framework for common-sense- and visually-enhanced Natural Language Generation that enables natural real-time communication, and hence effective collaboration, between humans and artificial agents such as robots. Human-Robot Interaction poses additional challenges to Natural Language Generation due to the uncertainty derived from dynamic environments and the non-deterministic fashion of interaction. For instance, the viewpoint of a situated robot changes when the robot moves, and with it the robot's representation of the world; this causes current state-of-the-art methods to fail, because they are not able to adapt to changing environments. The project aims to investigate methods for linking various modalities, taking into account their dynamic nature. To achieve natural, efficient and intuitive communication capabilities, agents will also need to acquire human-like abilities in synthesising knowledge and expression. The conditions under which external knowledge bases (such as Wikipedia) can be used to enhance natural language generation still have to be explored, as does the question of whether existing knowledge bases are useful for language generation.

The novel ways of integrating multi-modal data for language generation will lead to more robust and efficient interactions and will have an impact on natural language generation, social robotics, computer vision, and related fields. This might, in turn, spawn entirely novel applications, such as explaining exact procedures for e-health treatments and enhancing tutoring systems for educational purposes.

Planned Impact

Autonomous and intelligent systems are becoming prevalent. The International Federation of Robotics reports that in 2017, Europe increased its sales of personal/domestic robots by 25% to about 8.5 million units (value ~US$2.1bn) [1]. The Federation projects growth of 30-35% per year until 2020 for household robotics, which will be responsible for a variety of tasks, ranging from repetitive ones such as household maintenance to looking after the ageing population, assisting people with disabilities, education, and entertainment. The Office for National Statistics reports that the UK's population is getting older, with almost one-fifth of the population aged 65 and over in 2016. Additionally, according to UK Government data, 22% of UK citizens reported a disability in 2016/17, ranging from mobility disabilities to mental health and vision impairments [2]. These challenges open up a plethora of opportunities for care and assistive robots that are able to communicate with humans of all ages in an intuitive and effective manner. The most intuitive mode of communication between robots and humans is natural language. The interaction normally takes place in a situated environment, e.g. at home or at work, where it is important both to recognise and understand the surroundings and to associate them with common-sense knowledge in order to make further inferences.

In addition to the international academic community, other stakeholders will benefit from this research:

- The results of this research will have a long-term influence on new applications that aim to improve health, well-being, and quality of life, and to enable equal opportunities for education and health for all citizens. For instance, novel health applications will be able to improve people's mental health by offering support, and hence reduce the financial burden on health services. Innovative education applications will offer everyone the opportunity to learn, retrain, or upskill; for instance, robots could be used for training by showing and explaining how to perform a task, and could provide feedback and guidance by recognising how humans interact with objects.

- The standardisation of natural language generation technologies will provide evidence which will inform public policies at national and international level.

- Social robots will help businesses reduce training costs, especially in cases where training is expensive and precision is critical. In addition, by enhancing knowledge and understanding of the underlying technologies, innovative industrial applications will be realised, which in turn will create opportunities for high-skilled roles and attract foreign investment.

References:

[1] The International Federation of Robotics. WR 2018 Service Robots Executive Summary_revised (accessed June 2019). https://ifr.org/downloads/press2018/Executive_Summary_WR_Service_Robots_2018.pdf
[2] Department for Work & Pensions. The Family Resources Survey (accessed June 2019). https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/692771/family-resources-survey-2016-17.pdf

Funded Value:

£280,059

Funded Period:

Sep 20 - Sep 23

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/T014598/1

Principal Investigator:

Dimitra Gkatzia

Research Subject:

Info. & commun. Technol. (68%)

Linguistics (32%)

Research Topic:

Artificial Intelligence (33%)

Computational Linguistics (32%)

Human-Computer Interactions (10%)

Image & Vision Computing (25%)

Organisations

People	ORCID iD
Dimitra Gkatzia (Principal Investigator)

Publications

Author Name

Title Publication Date Published

Agarwal S (2020) Proceedings of the 1st Workshop on Evaluating NLG Evaluation (EvalNLGEval)

Buschmeier H (2020) Second Workshop on Natural Language Generation for Human-Robot Interaction

Clinciu M (2021) It's Commonsense, isn't it? Demystifying Human Evaluations in Commonsense-Enhanced NLG Systems

Clinciu M. (2021) It's Common Sense, isn't it? Demystifying Human Evaluations in Commonsense-enhanced NLG systems in Human Evaluation of NLP Systems, HumEval 2021 - Proceedings of the Workshop, as part of the 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021

Gkatzia D (2021) "What's this?" Comparing Active learning Strategies for Concept Acquisition in HRI

Howcroft D (2020) Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions

Howcroft D.M. (2020) Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions in INLG 2020 - 13th International Conference on Natural Language Generation, Proceedings

Howcroft D.M. (2022) Most NLG is Low-Resource: here's what we can do about it in GEM 2022 - 2nd Workshop on Natural Language Generation, Evaluation, and Metrics, Proceedings of the Workshop

Miltenburg E (2021) Underreporting of errors in NLG output, and what to do about it

Panagiaris N (2021) Generating unambiguous and diverse referring expressions in Computer Speech & Language

Panagiaris N (2020) Improving the Naturalness and Diversity of Referring Expression Generation models using Minimum Risk Training

Panagiaris N (2020) Generating unambiguous and diverse referring expressions in Computer Speech & Language

Panagiaris N. (2020) Improving the Naturalness and Diversity of Referring Expression Generation models using Minimum Risk Training in INLG 2020 - 13th International Conference on Natural Language Generation, Proceedings

Plant R. (2021) CAPE: Context-Aware Private Embeddings for Private Language Learning in EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings

Strathearn C (2022) Task2Dial: A Novel Task and Dataset for Commonsense-enhanced Task-based Dialogue Grounded in Documents

Strathearn C (2021) Chefbot: A Novel Framework for the Generation of Commonsense-enhanced Responses for Task-based Dialogue Systems

Strathearn C (2024) A Hybrid Rule-based and Generative Language Model for Flexible Instructional Dialogue

Strathearn C (2023) TaskMaster: A Novel Cross-platform Task-based Spoken Dialogue System for Human-Robot Interaction

Strathearn C (2022) Task2Dial: A Novel Task and Dataset for Commonsense enhanced Task-based Dialogue Grounded in Documents

Strathearn C (2021) Task2Dial: A Novel Task and Dataset for Commonsense-enhanced Task-based Dialogue Grounded in Documents

Strathearn C (2023) Analysis and Application of Natural Language and Speech Processing

Strathearn C (2022) A Commonsense-Enhanced Document-Grounded Conversational Agent: A Case Study on Task-Based Dialogue in Analysis and Application of Natural Language and Speech Processing

Strathearn C. (2021) Chefbot: A Novel Framework for the Generation of Commonsense-enhanced Responses for Task-based Dialogue Systems in INLG 2021 - 14th International Conference on Natural Language Generation, Proceedings

Strathearn C. (2021) Task2Dial: A Novel Task and Dataset for Commonsense-enhanced Task-based Dialogue Grounded in Documents in ICNLSP 2021 - Proceedings of the 4th International Conference on Natural Language and Speech Processing

Van Miltenburg E (2023) Barriers and enabling factors for error analysis in NLG research in Northern European Journal of Language Technology

Key Findings
Further Funding
Research Databases and Models
Research Tools and Methods
Collaboration
Engagement Activities


Description	Natural language generation (NLG) is a critical part of task-based conversational systems, as it has a significant impact on a user's experience with those systems. So far, task-based dialogue systems can focus on only one task and cannot easily diverge from it, even when doing so is necessary to complete the original task. This project addresses the problem of generating dialogue responses that display 'commonsense' abilities and has developed a novel framework that brings together commonsense-enhanced NLG and flexible dialogue management.
Exploitation Route	The code and datasets developed as part of this funding are open-source and available for other researchers to use.
Sectors	Digital/Communication/Information Technologies (including Software)


Description	Enhancing Labour Market Intelligence using Machine Learning
Amount	£60,000 (GBP)
Organisation	Skills Development Scotland
Sector	Public
Country	United Kingdom
Start	08/2021
End	10/2025


Description	Natural language interfaces to support career decision-making of young people
Amount	£60,000 (GBP)
Organisation	Skills Development Scotland
Sector	Public
Country	United Kingdom
Start	08/2020
End	10/2024


Description	Scottish Gaelic Generation for Exhibitions
Amount	£5,000 (GBP)
Organisation	Arts & Humanities Research Council (AHRC)
Sector	Public
Country	United Kingdom
Start	02/2022
End	09/2022


Description	Sentinel: Security alert level automation
Amount	£5,000 (GBP)
Organisation	Government of Scotland
Department	Scottish Funding Council
Sector	Public
Country	United Kingdom
Start	11/2021
End	01/2022


Title	CEC - Commonsense Evaluation Card
Description	The Commonsense Evaluation Card (CEC) aims to standardise human evaluation and reporting of commonsense-enhanced NLG systems, enabling researchers to compare models not only in terms of classic NLG quality criteria, but also by focusing on the core capabilities of such models.
Type Of Material	Improvements to research infrastructure
Year Produced	2021
Provided To Others?	Yes
Impact	This tool has helped in better documenting experiments related to commonsense knowledge.
URL	https://nlgknowledge.github.io/commonsense/


Title	Task2Dial
Description	TBA
Type Of Material	Database/Collection of data
Year Produced	2021
Provided To Others?	Yes
Impact	TBA
URL	https://huggingface.co/datasets/cstrathe435/Task2Dial


Description	Multi-party collaboration on Providing Recommendations of Error Analysis of NLG systems
Organisation	Charles University
Country	Czech Republic
Sector	Academic/University
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, trivago, Charles University in Prague, and others. All partners worked together to analyse the state of error reporting of NLG systems and provide recommendations so that future NLG publications discuss not only the benefits but also the errors made by the systems, with the aim of improving these aspects.
Collaborator Contribution	All partners worked together to analyse current trends in error reporting and provide recommendations on how error analysis in NLG systems should be performed with the aim to understand the limitations of current scientific advances.
Impact	Emiel van Miltenburg, Miruna-Adriana Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Emma Manning, Stephanie Schoch, Craig Thomson and Luou Wen. (2021). Underreporting of errors in NLG output, and what to do about it. In INLG 2021. Emiel Van Miltenburg, Miruna Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Stephanie Schoch, Craig Thomson, Luou Wen. Barriers and enabling factors for error analysis in NLG research. In Northern European Journal of Language Technology. 2023
Start Year	2021


Description	Multi-party collaboration on Providing Recommendations of Error Analysis of NLG systems
Organisation	Georgetown University
Country	United States
Sector	Academic/University
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, trivago, Charles University in Prague, and others. All partners worked together to analyse the state of error reporting of NLG systems and provide recommendations so that future NLG publications discuss not only the benefits but also the errors made by the systems, with the aim of improving these aspects.
Collaborator Contribution	All partners worked together to analyse current trends in error reporting and provide recommendations on how error analysis in NLG systems should be performed with the aim to understand the limitations of current scientific advances.
Impact	Emiel van Miltenburg, Miruna-Adriana Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Emma Manning, Stephanie Schoch, Craig Thomson and Luou Wen. (2021). Underreporting of errors in NLG output, and what to do about it. In INLG 2021. Emiel Van Miltenburg, Miruna Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Stephanie Schoch, Craig Thomson, Luou Wen. Barriers and enabling factors for error analysis in NLG research. In Northern European Journal of Language Technology. 2023
Start Year	2021


Description	Multi-party collaboration on Providing Recommendations of Error Analysis of NLG systems
Organisation	Heriot-Watt University
Country	United Kingdom
Sector	Academic/University
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, trivago, Charles University in Prague, and others. All partners worked together to analyse the state of error reporting of NLG systems and provide recommendations so that future NLG publications discuss not only the benefits but also the errors made by the systems, with the aim of improving these aspects.
Collaborator Contribution	All partners worked together to analyse current trends in error reporting and provide recommendations on how error analysis in NLG systems should be performed with the aim to understand the limitations of current scientific advances.
Impact	Emiel van Miltenburg, Miruna-Adriana Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Emma Manning, Stephanie Schoch, Craig Thomson and Luou Wen. (2021). Underreporting of errors in NLG output, and what to do about it. In INLG 2021. Emiel Van Miltenburg, Miruna Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Stephanie Schoch, Craig Thomson, Luou Wen. Barriers and enabling factors for error analysis in NLG research. In Northern European Journal of Language Technology. 2023
Start Year	2021


Description	Multi-party collaboration on Providing Recommendations of Error Analysis of NLG systems
Organisation	Trivago NV
Country	Germany
Sector	Private
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, trivago, Charles University in Prague, and others. All partners worked together to analyse the state of error reporting of NLG systems and provide recommendations so that future NLG publications discuss not only the benefits but also the errors made by the systems, with the aim of improving these aspects.
Collaborator Contribution	All partners worked together to analyse current trends in error reporting and provide recommendations on how error analysis in NLG systems should be performed with the aim to understand the limitations of current scientific advances.
Impact	Emiel van Miltenburg, Miruna-Adriana Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Emma Manning, Stephanie Schoch, Craig Thomson and Luou Wen. (2021). Underreporting of errors in NLG output, and what to do about it. In INLG 2021. Emiel Van Miltenburg, Miruna Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Stephanie Schoch, Craig Thomson, Luou Wen. Barriers and enabling factors for error analysis in NLG research. In Northern European Journal of Language Technology. 2023
Start Year	2021


Description	Multi-party collaboration on Providing Recommendations of Error Analysis of NLG systems
Organisation	University of Aberdeen
Country	United Kingdom
Sector	Academic/University
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, trivago, Charles University in Prague, and others. All partners worked together to analyse the state of error reporting of NLG systems and provide recommendations so that future NLG publications discuss not only the benefits but also the errors made by the systems, with the aim of improving these aspects.
Collaborator Contribution	All partners worked together to analyse current trends in error reporting and provide recommendations on how error analysis in NLG systems should be performed with the aim to understand the limitations of current scientific advances.
Impact	Emiel van Miltenburg, Miruna-Adriana Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Emma Manning, Stephanie Schoch, Craig Thomson and Luou Wen. (2021). Underreporting of errors in NLG output, and what to do about it. In INLG 2021. Emiel Van Miltenburg, Miruna Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Stephanie Schoch, Craig Thomson, Luou Wen. Barriers and enabling factors for error analysis in NLG research. In Northern European Journal of Language Technology. 2023
Start Year	2021


Description	Multi-party collaboration on Providing Recommendations of Error Analysis of NLG systems
Organisation	University of Helsinki
Country	Finland
Sector	Academic/University
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, trivago, Charles University in Prague, and others. All partners worked together to analyse the state of error reporting of NLG systems and provide recommendations so that future NLG publications discuss not only the benefits but also the errors made by the systems, with the aim of improving these aspects.
Collaborator Contribution	All partners worked together to analyse current trends in error reporting and provide recommendations on how error analysis in NLG systems should be performed with the aim to understand the limitations of current scientific advances.
Impact	Emiel van Miltenburg, Miruna-Adriana Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Emma Manning, Stephanie Schoch, Craig Thomson and Luou Wen. (2021). Underreporting of errors in NLG output, and what to do about it. In INLG 2021. Emiel Van Miltenburg, Miruna Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Stephanie Schoch, Craig Thomson, Luou Wen. Barriers and enabling factors for error analysis in NLG research. In Northern European Journal of Language Technology. 2023
Start Year	2021


Description	Multi-party collaboration on Providing Recommendations of Error Analysis of NLG systems
Organisation	University of Tilburg
Country	Netherlands
Sector	Academic/University
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, trivago, Charles University in Prague, and others. All partners worked together to analyse the state of error reporting of NLG systems and provide recommendations so that future NLG publications discuss not only the benefits but also the errors made by the systems, with the aim of improving these aspects.
Collaborator Contribution	All partners worked together to analyse current trends in error reporting and provide recommendations on how error analysis in NLG systems should be performed with the aim to understand the limitations of current scientific advances.
Impact	Emiel van Miltenburg, Miruna-Adriana Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Emma Manning, Stephanie Schoch, Craig Thomson and Luou Wen. (2021). Underreporting of errors in NLG output, and what to do about it. In INLG 2021. Emiel Van Miltenburg, Miruna Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Stephanie Schoch, Craig Thomson, Luou Wen. Barriers and enabling factors for error analysis in NLG research. In Northern European Journal of Language Technology. 2023
Start Year	2021


Description	Multi-party collaboration on Providing Recommendations of Error Analysis of NLG systems
Organisation	University of Virginia (UVa)
Country	United States
Sector	Academic/University
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, trivago, Charles University in Prague, and others. All partners worked together to analyse the state of error reporting of NLG systems and provide recommendations so that future NLG publications discuss not only the benefits but also the errors made by the systems, with the aim of improving these aspects.
Collaborator Contribution	All partners worked together to analyse current trends in error reporting and provide recommendations on how error analysis in NLG systems should be performed with the aim to understand the limitations of current scientific advances.
Impact	Emiel van Miltenburg, Miruna-Adriana Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Emma Manning, Stephanie Schoch, Craig Thomson and Luou Wen. (2021). Underreporting of errors in NLG output, and what to do about it. In INLG 2021. Emiel Van Miltenburg, Miruna Clinciu, Ondrej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Stephanie Schoch, Craig Thomson, Luou Wen. Barriers and enabling factors for error analysis in NLG research. In Northern European Journal of Language Technology. 2023
Start Year	2021


Description	Multi-party collaboration/study on Evaluation of Commonsense-enhanced NLG systems
Organisation	Heriot-Watt University
Country	United Kingdom
Sector	Academic/University
PI Contribution	TBA
Collaborator Contribution	TBA
Impact	Miruna-Adriana Clinciu, Dimitra Gkatzia, Saad Mahamood. 2021. It's Commonsense, isn't it? Demystifying Human Evaluations in Commonsense-Enhanced NLG Systems. In Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval) at EACL 2021.
Start Year	2021


Description	Multi-party collaboration/study on Evaluation of Commonsense-enhanced NLG systems
Organisation	Trivago NV
Country	Germany
Sector	Private
PI Contribution	TBA
Collaborator Contribution	TBA
Impact	Miruna-Adriana Clinciu, Dimitra Gkatzia, Saad Mahamood. 2021. It's Commonsense, isn't it? Demystifying Human Evaluations in Commonsense-Enhanced NLG Systems. In Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval) at EACL 2021.
Start Year	2021


Description	Multi-party collaboration/study on Evaluation of NLG systems
Organisation	CVS Health
Country	United States
Sector	Private
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, University of Brighton, trivago, CVS Health, Universitat Pompeu Fabra, Tilburg University and University of North Carolina at Charlotte. All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Collaborator Contribution	All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Impact	David M. Howcroft, Anya Belz, Miruna-Adriana Clinciu, Dimitra Gkatzia, Sadid A. Hasan, Saad Mahamood, Simon Mille, Emiel van Miltenburg, Sashank Santhanam, Verena Rieser. 2020. Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions. Proceedings of the 13th International Conference on Natural Language Generation. https://www.aclweb.org/anthology/2020.inlg-1.23/
Start Year	2020


Description	Multi-party collaboration/study on Evaluation of NLG systems
Organisation	Heriot-Watt University
Country	United Kingdom
Sector	Academic/University
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, University of Brighton, trivago, CVS Health, Universitat Pompeu Fabra, Tilburg University and University of North Carolina at Charlotte. All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Collaborator Contribution	All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Impact	David M. Howcroft, Anya Belz, Miruna-Adriana Clinciu, Dimitra Gkatzia, Sadid A. Hasan, Saad Mahamood, Simon Mille, Emiel van Miltenburg, Sashank Santhanam, Verena Rieser. 2020. Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions. Proceedings of the 13th International Conference on Natural Language Generation. https://www.aclweb.org/anthology/2020.inlg-1.23/
Start Year	2020


Description	Multi-party collaboration/study on Evaluation of NLG systems
Organisation	Pompeu Fabra University
Country	Spain
Sector	Academic/University
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, University of Brighton, trivago, CVS Health, Universitat Pompeu Fabra, Tilburg University and University of North Carolina at Charlotte. All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Collaborator Contribution	All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Impact	David M. Howcroft, Anya Belz, Miruna-Adriana Clinciu, Dimitra Gkatzia, Sadid A. Hasan, Saad Mahamood, Simon Mille, Emiel van Miltenburg, Sashank Santhanam, Verena Rieser. 2020. Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions. Proceedings of the 13th International Conference on Natural Language Generation. https://www.aclweb.org/anthology/2020.inlg-1.23/
Start Year	2020


Description	Multi-party collaboration/study on Evaluation of NLG systems
Organisation	Trivago NV
Country	Germany
Sector	Private
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, University of Brighton, trivago, CVS Health, Universitat Pompeu Fabra, Tilburg University and University of North Carolina at Charlotte. All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Collaborator Contribution	All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Impact	David M. Howcroft, Anya Belz, Miruna-Adriana Clinciu, Dimitra Gkatzia, Sadid A. Hasan, Saad Mahamood, Simon Mille, Emiel van Miltenburg, Sashank Santhanam, Verena Rieser. 2020. Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions. Proceedings of the 13th International Conference on Natural Language Generation. https://www.aclweb.org/anthology/2020.inlg-1.23/
Start Year	2020


Description	Multi-party collaboration/study on Evaluation of NLG systems
Organisation	University of Brighton
Country	United Kingdom
Sector	Academic/University
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, University of Brighton, trivago, CVS Health, Universitat Pompeu Fabra, Tilburg University and University of North Carolina at Charlotte. All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Collaborator Contribution	All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Impact	David M. Howcroft, Anya Belz, Miruna-Adriana Clinciu, Dimitra Gkatzia, Sadid A. Hasan, Saad Mahamood, Simon Mille, Emiel van Miltenburg, Sashank Santhanam, Verena Rieser. 2020. Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions. Proceedings of the 13th International Conference on Natural Language Generation. https://www.aclweb.org/anthology/2020.inlg-1.23/
Start Year	2020


Description	Multi-party collaboration/study on Evaluation of NLG systems
Organisation	University of North Carolina at Charlotte
Country	United States
Sector	Academic/University
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, University of Brighton, trivago, CVS Health, Universitat Pompeu Fabra, Tilburg University and University of North Carolina at Charlotte. All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Collaborator Contribution	All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Impact	David M. Howcroft, Anya Belz, Miruna-Adriana Clinciu, Dimitra Gkatzia, Sadid A. Hasan, Saad Mahamood, Simon Mille, Emiel van Miltenburg, Sashank Santhanam, Verena Rieser. 2020. Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions. Proceedings of the 13th International Conference on Natural Language Generation. https://www.aclweb.org/anthology/2020.inlg-1.23/
Start Year	2020


Description	Multi-party collaboration/study on Evaluation of NLG systems
Organisation	University of Tilburg
Country	Netherlands
Sector	Academic/University
PI Contribution	This is a multi-partner collaboration between Edinburgh Napier, Heriot-Watt University, University of Brighton, trivago, CVS Health, Universitat Pompeu Fabra, Tilburg University and University of North Carolina at Charlotte. All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Collaborator Contribution	All partners worked together to collect, annotate and curate data regarding human evaluations of NLG systems with the aim of standardising evaluations and promoting reproducibility of results.
Impact	David M. Howcroft, Anya Belz, Miruna-Adriana Clinciu, Dimitra Gkatzia, Sadid A. Hasan, Saad Mahamood, Simon Mille, Emiel van Miltenburg, Sashank Santhanam, Verena Rieser. 2020. Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions. Proceedings of the 13th International Conference on Natural Language Generation. https://www.aclweb.org/anthology/2020.inlg-1.23/
Start Year	2020


Description	Article on CBC Kids
Form Of Engagement Activity	A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Other audiences
Results and Impact	A contribution to an article explaining the uncanny valley in robotics to kids.
Year(s) Of Engagement Activity	2023
URL	https://www.cbc.ca/kidsnews/post/exploring-the-uncanny-valley-tiktok-trend


Description	Blogpost at trivago.com website about our collaboration/joint work
Form Of Engagement Activity	Engagement focused website, blog or social media channel
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Public/other audiences
Results and Impact	Our industry collaborator published a blogpost about our recent work in Natural Language Generation. The website is visited by a large number of people internationally.
Year(s) Of Engagement Activity	2022
URL	https://tech.trivago.com/post/2022-03-31-improving-evaluation-practices-in-natural-language-generati...


Description	C Strathearn participated in the Robot Talk podcast
Form Of Engagement Activity	A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Media (as a channel to the public)
Results and Impact	Podcast about humanoid robots, realistic robot faces and speech including talking about CiViL.
Year(s) Of Engagement Activity	2023
URL	https://www.robottalk.org/2023/11/03/episode-60-carl-strathearn/


Description	Invited Talk at Verint
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	I gave a talk at Verint, a global company specialising in Customer Experience using automation, AI, and cloud, as well as security and intelligence mining software.
Year(s) Of Engagement Activity	2020


Description	Invited seminar talk at the National Research Council of Canada.
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	David Howcroft presented "Disentangling 20 years of confusion in NLG: toward standards for human evaluation" at the National Research Council of Canada's Natural Language Processing seminar, having been invited by Cyril Goutte. The discussion included useful similarities between evaluation for Natural Language Generation (NLG) and machine translation in particular, including gaps in terms of designing studies to measure the preferences of individual target groups as well as discussions of performing evaluation in low-resource settings.
Year(s) Of Engagement Activity	2021


Description	Invited to participate at a panel on Explainable AI at the inaugural Scottish AI Summit
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	I was invited to join a panel on why AI is still a black box, which discussed the limitations and opportunities of explainable AI. The event was attended by 300 people in person and over 500 online. Attendees included politicians, academics, and representatives of industry and the third sector, such as Unicef.
Year(s) Of Engagement Activity	2022
URL	https://www.scottishaisummit.com/


Description	Organisation of the 2nd Workshop on Natural Language Generation for Human-Robot Interaction (NLG4HRI)
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	In Human-Robot Interaction (HRI), a primary goal is to develop robotic agents that exhibit socially intelligent behaviour while interacting with human partners. Despite the clear relationship between social intelligence and fluent, flexible linguistic interaction, in practice few interactive robots employ anything beyond a simple, hard-coded process when generating linguistic output. On the other hand, in Natural Language Generation (NLG), the sub-area of computational linguistics dedicated to producing high-quality natural-language output, increasingly sophisticated methods have been developed for language production. However, while the interactive settings and dynamic environments provided by HRI open up interesting research problems in NLG, this connection has not been extensively researched. The first workshop in this series, at the INLG 2018 conference, brought together members of the INLG and HRI research communities for a day of discussion and confirmed that there is mutual interest in exploring the possibilities of applying NLG techniques to problems drawn from HRI. At the current workshop, at the International Conference on Natural Language Generation, INLG 2020, we aim to bring those communities together again, this time with a concrete goal: to define one or more novel shared tasks, based on problems from HRI, that will allow NLG researchers to develop and compare techniques for generation in this space, and that will also allow HRI researchers to benefit from potentially higher-quality linguistic output in their applications. The workshop was of potential interest to researchers from other fields that focus on 'interaction' such as spoken dialogue systems, intelligent virtual agents, or intelligent user interfaces.
Year(s) Of Engagement Activity	2020
URL	https://hbuschme.github.io/nlg-hri-workshop-2020/


Description	PC Pro Magazine Interview by Postdoc Carl Strathearn
Form Of Engagement Activity	A magazine, newsletter or online publication
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Public/other audiences
Results and Impact	Interview for PC Pro magazine, 'The UK's biggest selling PC monthly magazine', on building realistic humanoid robots with commonsense- and visually-enhanced language and dialogue capabilities.
Year(s) Of Engagement Activity	2021
URL	https://twitter.com/CarlStrathearn/status/1469975257217970180/photo/1


Description	Podcast interview by Postdoc Carl Strathearn
Form Of Engagement Activity	A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Public/other audiences
Results and Impact	Discussion about humanoid robots and how to take them outside the uncanny valley.
Year(s) Of Engagement Activity	2021
URL	https://www.sciencefocus.com/future-technology/podcast-why-realistic-humanoid-robots-need-to-learn-t...


Description	Professorial Talk at Edinburgh Napier University open day
Form Of Engagement Activity	Participation in an open day or visit at my research institution
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Undergraduate students
Results and Impact	Around 160 students attended my professorial talk on "How close are we to achieving Human-like AI? From Eliza to Alexa and beyond", which described the current state of dialogue systems and natural language generation, discussed the limitations of current systems, and addressed the "misinformation" about AI as presented in the media. The talk sparked a lively discussion on the topic.
Year(s) Of Engagement Activity	2021


Description	Robohub - Women in Robotics Spotlight
Form Of Engagement Activity	A magazine, newsletter or online publication
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	The Women in Robotics spotlight was distributed via email and reached all members of Robohub. It sparked interest in common sense in robotics and led to invitations to participate in mentorship schemes and reviewing.
Year(s) Of Engagement Activity	2020
URL	https://robohub.org/women-in-robotics-update-ecem-tuglan-tuong-anh-ens-sravanthi-kanchi-kajal-gada-d...


Description	Special Session
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	We organised a special session on Natural Language in Human-Robot Interaction during the SIGDIAL conference. The session featured a panel of experts discussing challenges and opportunities in this interdisciplinary area, as well as technical talks.
Year(s) Of Engagement Activity	2022
URL	https://2022.sigdial.org/call-for-papers-nlihri/?fbclid=IwAR3TfY8gIhWedSTTHb4s1zFrgHoEImfMGHbGCUAq61...


Description	Workshop on Evaluating NLG Evaluation
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	This workshop is intended as a discussion platform on the status and the future of the evaluation of Natural Language Generation systems. Among other topics, we will discuss current evaluation quality, human versus automated metrics, and the development of shared tasks for NLG evaluation. The workshop also involves an 'unshared task', where participants are invited to experiment with evaluation data from earlier shared tasks.
Year(s) Of Engagement Activity	2020
URL	https://evalnlg-workshop.github.io/
