DILiGENt: Domain-Independent Language Generation
Lead Research Organisation:
Heriot-Watt University
Department Name: School of Mathematical and Computer Sciences
Abstract
We propose a two-year project to develop a novel data-driven methodology to rapidly create high-quality NLG systems for new domains, by combining recent advances in three domains:
(1) advances in statistical models for NLG,
(2) crowdsourcing methods for natural language data collection, which have shown first promising results in related fields, such as Machine Translation, and
(3) recently developed imitation learning algorithms for structured prediction.
The project team combines expertise of two leading research groups in these areas:
At Heriot-Watt University, we recently demonstrated the potential for data-driven statistical NLG in limited domains. In order to make this framework domain-independent, we will leverage recent machine learning models developed by researchers at University College London. These models learn by imitating the actions a human expert would perform to generate NL utterances, which we collect via a tightly integrated crowdsourcing procedure. The outcome of this work is a framework which will allow the rapid development of NLG systems for new domains, and thus accelerate the impact NLG technology has on the market.
We will showcase this framework on a dataset provided by the BBC, where we address the problem of generating weather reports for over 20,000 individual locations. Currently, the BBC website features only 10 reports written by meteorologists. Each of these reports covers a rather large area of the country (e.g. East of England), and is thus of little interest to users, who are usually interested in the weather in a particular location (e.g. Norwich).
In a second, more ambitious step, we will explore how this framework scales to more complex interactive dialogue settings, where generation has to account for discourse phenomena, such as long-distance discourse relations or syntactic coordination. This will be evaluated in a shared task challenge for generation in interactive systems, hosted by Heriot-Watt University.
In sum, this project will further our understanding of domain-independent language generation, as well as deliver substantial and novel resources to support future research in this area (in the form of code and data), and practical implementations of NLG systems in a wide range of domains, from weather reports to natural language interfaces.
Planned Impact
(Please note that part of the following text is taken from the National Importance section of Part II)
Machine Learning for Natural Language Processing (NLP) is an area which has begun to generate positive economic impact, and start-ups and large companies such as Microsoft, Amazon, and Yahoo! are making substantial new investments in this field. The UK has one of the highest proportions of world-leading NLG research centres. However, in contrast to other areas of NLP, the UK is still under-represented in using machine learning approaches for NLG. This proposal will help to strengthen this research strand in the UK.
From an end-user importance point of view, this work will be of interest to a wide range of businesses and companies in the UK (see letters of support). NLG creates more flexible output for interactive natural language interfaces. Repetitive linguistic output is one of the main problems for intelligent personal assistants (see for example Apple's Siri or Google Voice). As such, NLG technology has a strong potential for commercialisation, as the successful company ARRIA NLG has demonstrated.
The overall aim of this research - the rapid development of NLG systems for new domains - is to accelerate the impact that NLG technologies have on the market. To this end, we will assess our framework in real-world applications, for example generating local weather reports for the BBC website. Furthermore, we will actively seek ways to commercialise this research, such as patenting and licensing the framework that we develop.
From a societal importance point of view, NLG technology directly contributes to addressing key UK societal challenges: text simplification and data-to-text systems will be increasingly used to enable easy access to information across society. This will remove barriers to the use and understanding of information from large volumes of data. NLG systems have also been used in modern health technology, especially in the areas of personalised and localised healthcare.
Organisations
Publications
Bartie P.
(2016)
The REAL corpus: A crowd-sourced Corpus of human generated and evaluated spatial references to real-world urban scenes
in Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016
Cercas Curry A
(2019)
A Crowd-based Evaluation of Abuse Response Strategies in Conversational Agents
Cercas Curry A
(2021)
ConvAbuse: Data, Analysis, and Benchmarks for Nuanced Detection in Conversational AI
Curry A.C.
(2018)
#MeToo: How conversational systems respond to sexual harassment
in Proceedings of the 2nd ACL Workshop on Ethics in Natural Language Processing, EthNLP 2018 at the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018
Description | Natural language generation (NLG) plays a critical role for Conversational Agents as it has a significant impact on a user's impression of the system. So far, NLG systems are mainly hand-crafted or learned from semantically aligned data, both of which are expensive and non-scalable. This project addresses the problem of generating text from meaning representations (MRs), such as dialogue acts, which are not aligned with phrases in text. We have discovered that (1) Imitation Learning achieves results comparable to the state-of-the-art approaches, both in terms of automatic measures and human judgements (Lampouras & Vlachos, COLING 2016). (2) We can effectively crowd-source data of sufficient quality and quantity to train these algorithms. In particular, using pictorial representations of MRs reduces bias and elicits more syntactically varied and lexically rich data (Novikova, Lemon, Rieser, INLG 2016). (3) Current automatic evaluation metrics are not representative of human judgements (Novikova, Dusek, Rieser, EMNLP 2017). (4) End-to-end neural approaches are promising, but are often outperformed by hand-engineered systems in terms of overall quality, as well as complexity, length and diversity of outputs (Dusek, Novikova, Rieser, under submission 2018). Being able to effectively collect and generate from unaligned data now enables us to rapidly develop NLG systems for new applications and domains. |
Exploitation Route | Heriot-Watt University is currently involved in a couple of knowledge transfer collaborations with industry, including EmoTech LTD, AdeptMind.ai and Amazon.com. |
Sectors | Digital/Communication/Information Technologies (including Software) Education Healthcare |
URL | http://diligent-project.tumblr.com/about |
Description | Sheffield's Imitation Learning framework is used by the popular Natural Language Processing toolkit spacy.io. Heriot-Watt University organised a shared research challenge as part of this research. Our shared task received 62 submissions with diverse system architectures by 17 institutions from 11 countries, with about 1/3 of these submissions coming from industry. This work has substantially shaped the Natural Language Generation landscape: First, the dataset has become a widely used benchmark for model evaluation (as evidenced by citations). Second, our results have highly influenced academic discourse. We show that neural methods are prone to pathological output, such as hallucination and omission, which has led to new research avenues, including my group's work. Heriot-Watt also established a specialised postgraduate degree in this area, where techniques researched and developed as part of this grant are now taught. |
First Year Of Impact | 2017 |
Sector | Digital/Communication/Information Technologies (including Software),Education |
Impact Types | Economic |
Description | DATAIA scientific advisory board |
Geographic Reach | Europe |
Policy Influence Type | Participation in a guidance/advisory committee |
URL | https://www.dataia.eu/linstitut/le-conseil-scientifique |
Description | Member of the RSE Working Group on AI |
Geographic Reach | National |
Policy Influence Type | Membership of a guideline committee |
Description | New MSc Programme in Speech and Multimodal Interaction |
Geographic Reach | National |
Policy Influence Type | Influenced training of practitioners or researchers |
Impact | Verena Rieser created a new postgraduate MSc programme at Heriot-Watt, which aims to educate highly employable experts in creating conversational multimodal interfaces. The programme recently received 6 fully funded studentships from the DataLab/Scottish Funding Council. |
URL | http://www.macs.hw.ac.uk/cs/pgcourses/aiws.htm |
Description | AI for Good |
Amount | £15,000 (GBP) |
Organisation | Nesta |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 03/2020 |
End | 09/2020 |
Description | Amazon Alexa Challenge 2017 |
Amount | $100,000 (USD) |
Organisation | Amazon.com |
Sector | Private |
Country | United States |
Start | 11/2016 |
End | 11/2017 |
Description | Amazon Alexa Challenge 2018 |
Amount | $250,000 (USD) |
Organisation | Amazon.com |
Sector | Private |
Country | United States |
Start | 02/2018 |
End | 12/2018 |
Description | DataLab MSc scholarships |
Amount | £36,000 (GBP) |
Organisation | Government of Scotland |
Department | Scottish Funding Council |
Sector | Public |
Country | United Kingdom |
Start | 08/2017 |
End | 08/2018 |
Description | DataLab knowledge exchange UK Industry |
Amount | £114,000 (GBP) |
Organisation | Government of Scotland |
Department | Scottish Funding Council |
Sector | Public |
Country | United Kingdom |
Start | 12/2016 |
End | 12/2017 |
Description | EPSRC First Grant |
Amount | £100,000 (GBP) |
Funding ID | EP/R021643/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 01/2018 |
End | 06/2019 |
Description | EPSRC Impact Acceleration |
Amount | £45,000 (GBP) |
Organisation | Heriot-Watt University |
Sector | Academic/University |
Country | United Kingdom |
Start | 11/2017 |
End | 10/2018 |
Description | EPSRC Standard Grant |
Amount | £520,000 (GBP) |
Funding ID | EP/N017536/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 05/2016 |
End | 05/2019 |
Description | James Watt PhD Scholarship |
Amount | £40,000 (GBP) |
Organisation | Heriot-Watt University |
Sector | Academic/University |
Country | United Kingdom |
Start | 07/2016 |
End | 07/2019 |
Description | Leverhulme Trust Senior Research Fellowship 2020 |
Amount | £47,000 (GBP) |
Funding ID | SRF\R1\201100 |
Organisation | The Royal Society |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 08/2020 |
End | 08/2021 |
Description | SICSA Conference and workshop organisation |
Amount | £700 (GBP) |
Organisation | SICSA Scottish Informatics and Computer Science Alliance |
Sector | Academic/University |
Country | United Kingdom |
Start | 03/2015 |
End | 03/2015 |
Description | SICSA Postdoctoral and Early Career Researcher Exchanges (PECE) |
Amount | £2,028 (GBP) |
Organisation | SICSA Scottish Informatics and Computer Science Alliance |
Sector | Academic/University |
Country | United Kingdom |
Start |
Description | direct industry funding |
Amount | £88,000 (GBP) |
Organisation | AdeptMind Inc |
Sector | Private |
Country | Canada |
Start | 08/2017 |
End | 08/2020 |
Title | Alana Chatbot for Amazon Alexa US |
Description | We have created an open-domain chatbot called "Alana", which participated as one of the bots in the Amazon Alexa Challenge 2017. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2017 |
Provided To Others? | Yes |
Impact | As part of this challenge, Alana reached over 360,000 Amazon Alexa users in the US. Also, Amazon's research tool "CoBot" was largely inspired by our bot's architecture. While the tool is not yet available to the general public, we published a research paper providing details. |
URL | https://s3.amazonaws.com/alexaprize/2017/technical-article/alana.pdf |
Title | BLOOM Large Language Model |
Description | We created BLOOM, the first publicly available large language model. This was a year-long collaboration as part of the BigScience workshop with several hundred international scientists. I co-led one of the working groups. BLOOM stands for BigScience Large Open-science Open-access Multilingual Language Model. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2022 |
Provided To Others? | Yes |
Impact | First publicly available "foundational model". Widely used and compared in the community. The ambition is to boost academic research and public benefits in competition with privately owned models such as ChatGPT. |
URL | https://huggingface.co/bigscience/bloom |
Title | Dockerized AMR-based semantic parser |
Description | We provided the dockerized AMR-based semantic parser learned with imitation learning which allows other researchers to reproduce our results. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | It has been downloaded more than 100 times at the time of writing. |
URL | https://hub.docker.com/r/andreasvlachos/semeval2016_amr_parser/ |
Title | E2E NLG Challenge: Benchmarking Neural vs. Handcrafted Approaches for Language Generation |
Description | Recent end-to-end data-driven Natural Language Generation (NLG) systems are promising to significantly reduce the need for annotated data and ultimately result in reduced development costs for NLG systems. We developed and organised the first shared task on end-to-end (E2E) NLG, which aims to assess whether these novel approaches can generate more complex output which would be suitable for real-world applications. Our shared task received 62 submissions with diverse system architectures by 17 institutions from 11 countries, with about 1/3 of these submissions coming from industry. We consider this level of participation an unexpected success, which underlines the timeliness of this task. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2017 |
Provided To Others? | Yes |
Impact | We compare 62 submitted systems covering a wide range of approaches, with sequence-to-sequence (seq2seq) models being the most frequently used. We find that seq2seq systems generally score high in terms of word-overlap metrics and human evaluations of naturalness. However, they often fail to correctly express a given meaning representation if they lack a strong semantic control mechanism applied during decoding. Moreover, seq2seq models are often outperformed by hand-engineered systems in terms of overall quality, as well as complexity, length and diversity of outputs. |
URL | http://www.macs.hw.ac.uk/InteractionLab/E2E/ |
Title | RankMe: Reliable Human Ratings for Natural Language Generation |
Description | We have developed and published a new method for evaluating Natural Language Generation. In particular, the RankME method was shown to produce more reliable human ratings for NLG and related tasks. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | This method yields more reliable results regarding the quality of language generation systems. RankME was developed for the E2E NLG task, which is a shared international challenge to benchmark end-to-end generation approaches. This methodology is currently also being tested for open-domain response generation in dialogue systems. A future version will be implemented as part of the Facebook Parl.AI platform. |
URL | http://www.macs.hw.ac.uk/InteractionLab/E2E/ |
Title | Software for learning NLG models from unaligned data |
Description | It achieved state of the art results on 3 datasets when compared to systems developed specifically for them and we made it available as open source. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2017 |
Provided To Others? | Yes |
Impact | This system allowed us to participate in the E2E with strong results. |
URL | https://github.com/glampouras/JLOLS_NLG |
Title | Analogue corpus |
Description | This dataset was released with the submission of the INLG 2016 paper "Crowd-sourcing NLG Data: Pictures Elicit Better Data" (https://aclweb.org/anthology/W/W16/W16-6644.pdf). The data was used to study crowd-sourcing techniques for NLG and will also be used to automatically train an NLG system. It contains pairs of textual meaning representations (MRs) and associated natural language utterances produced by crowd-workers. It also contains associated pictorial representations of MRs, as well as user ratings of the crowd-sourced utterances. |
Type Of Material | Database/Collection of data |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | We were invited to organise a shared task based on an extended version of this data set. Also, this data set is used by researchers at LORIA for their research (publication in submission). |
URL | https://github.com/jeknov/INLG_16_submission |
Title | NLG dataset |
Description | Natural Language Generation dataset collected using crowd-sourcing. |
Type Of Material | Database/Collection of data |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | Resulted in the publication of the INLG 2016 conference paper "Crowd-sourcing NLG Data: Pictures Elicit Better Data". |
URL | https://github.com/jeknov/INLG_16_submission |
Title | The E2E Challenge Dataset |
Description | The E2E dataset is a new dataset for training end-to-end, data-driven natural language generation systems in the restaurant domain, which is ten times bigger than existing, frequently used datasets in this area. The E2E dataset poses new challenges: (1) its human reference texts show more lexical richness and syntactic variation, including discourse phenomena; (2) generating from this set requires content selection. As such, learning from this dataset promises more natural, varied and less template-like system utterances. |
Type Of Material | Database/Collection of data |
Year Produced | 2017 |
Provided To Others? | Yes |
Impact | The E2E set was used in the E2E NLG Challenge we organized, which provides an extensive list of results achieved on this data. The interest in the E2E NLG shared task has far exceeded our expectations. We received a total of 60 submissions by 16 institutions with about 1/3 of these submissions coming from industry. In comparison, the well-established Conference on Machine Translation, WMT'17 (running since 2006), had 31 institutions submitting to a total of 8 tasks. After the shared task was completed, the E2E dataset was made freely available online in full (including test data) and has been downloaded 41 times as of March 6, 2018. |
URL | http://www.macs.hw.ac.uk/InteractionLab/E2E/ |
Description | AdeptMind PhD overseas scholarship |
Organisation | AdeptMind Inc |
Country | Canada |
Sector | Private |
PI Contribution | We are applying and extending our research in data-driven Natural Language Generation techniques to e-commerce. In particular, we are investigating how to translate search results into textual descriptions which support the user in decision making. |
Collaborator Contribution | AdeptMind funds this overseas PhD scholarship with a £88k cash contribution, as well as other in-kind contributions, such as invited research visits (incl. travel costs), student training (incl. summer schools), as well as sharing data sets and sponsoring data collection. |
Impact | This collaboration brings together research on natural language processing and advanced machine learning, as well as expertise in e-commerce platforms. |
Start Year | 2017 |
Description | Adobe Research |
Organisation | Adobe Inc. |
Country | United States |
Sector | Private |
PI Contribution | One of my PhD students went for a summer internship to Adobe, where he continued working on a topic related to his PhD. When he came back, we refined the research hypothesis, ran some more experiments, and submitted a paper to ACL (the premier conference in the field). We already received the scores and we think it is likely that the paper will be accepted. This paper would be a very good candidate to submit to REF. My student used high performance computing from the HWU equipment account. |
Collaborator Contribution | Adobe supervised my student's work for 3 months during the internship and also paid his flight and salary. Adobe then also sent us a research gift of £7000 and encouraged us to apply for further funding. |
Impact | TBA |
Start Year | 2019 |
Description | Amazon Alexa Challenge 2017, 2018 |
Organisation | Amazon.com |
Country | United States |
Sector | Private |
PI Contribution | My team was selected to participate in the Amazon Alexa Challenge in two consecutive years: 2017 and 2018. The aim of this challenge is to build a social chat bot that can converse coherently and engagingly with humans on popular topics for 20 minutes. For the 2017 round, we were one of 12 teams selected out of a pool of over 100 applicants. For the 2018 round, we were one of eight teams selected out of ca. 200 applicants. |
Collaborator Contribution | We received a generous gift of $100,000 (2017) and $250,000 (2018) and various in-kind contributions worth ca. $100k for both years, e.g. free training and access to Amazon Web Services, Alexa-enabled devices, a weekly class with one of Amazon's senior researchers, and invited research visits to Amazon HQ in Seattle (including sponsored travel for the team). We won 3rd prize for the 2017 challenge, which included a $50,000 cash prize for the students. |
Impact | Increased recognition and visibility of my research group and department. |
Start Year | 2016 |
Description | Amazon SimBot Challenge |
Organisation | Amazon.com |
Department | Amazon UK |
Country | United Kingdom |
Sector | Private |
PI Contribution | My student team was selected to participate in the Amazon SimBot challenge. |
Collaborator Contribution | Our entry is supported with a grant from Amazon and in-kind contributions such as an invited visit to Amazon headquarters in Seattle as well as 2 days of workshops with Amazon staff. |
Impact | We expect a number of outcomes, including publications, student internships, and raising the international profile of our lab and university in this research area. |
Start Year | 2021 |
Description | Apple NLU research award |
Organisation | Apple |
Country | United States |
Sector | Private |
PI Contribution | This research gift supports research on low-resource Natural Language Generation. |
Collaborator Contribution | Research gift and monthly meetings. |
Impact | not yet |
Start Year | 2021 |
Description | EmoTech North Industry Knowledge Exchange |
Organisation | EmoTech Ltd |
Country | United Kingdom |
Sector | Private |
PI Contribution | We collaborate on designing and implementing a conversational interface for Olly the Robot, an in-home robot with conversational capabilities developed by Emotech Ltd. The Olly robot recently won 4 awards for Innovation at the CES showcase. (The CES Innovation Awards is an annual competition honoring outstanding design and engineering in consumer technology products from around the world.) Recently showcased at CES '17: http://www.bbc.com/news/technology-38504512 The project outcome will directly contribute to the Olly product of Emotech. Emotech will release 1000-1500 units in June/July via a Kickstarter program to gauge early adopter feedback. Full commercial release is expected in Q3/4 2017 at a retail price of $600-800 per unit. The revenue of Emotech LTD in 2017 is estimated to be £2m, and is expected to grow to £20-40m in 2018. Emotech North Ltd will be an NLP (Natural Language Processing) hub for Emotech. Its growth will create more employment positions and more collaborations with other industry partners and universities in Scotland. |
Collaborator Contribution | Cash contribution of £58k to support RA. Invited research visit to London (1 week) fully supported. |
Impact | Robotics hardware, neuroscience, human-computer interaction |
Start Year | 2016 |
Description | Google Dialog and NLU research award |
Organisation | Google |
Country | United States |
Sector | Private |
PI Contribution | This research gift supports an informal collaboration between Google Zurich and my group on topics related to dialogue systems and Natural Language Understanding. |
Collaborator Contribution | We received a research gift from Google to support research expenses. |
Impact | The award has supported my group with hardware, travel and data services (such as transcriptions and crowdsourcing) |
Start Year | 2020 |
Description | IDS Oil and Gas |
Organisation | Independent Data Services UK |
Country | United Kingdom |
Sector | Private |
PI Contribution | IDS has been providing operational reporting solutions to the upstream oil and gas industry for over twenty years. This project provides consultancy on how best to use NLP and machine learning methods to mine information available in Oil & Gas data sets. |
Collaborator Contribution | IDS has sent one of their employees to work at Heriot-Watt for a week. Also, a cash contribution of £5k was made. |
Impact | This project combines Data Science with NLP and applies it to Oil & Gas. The outcomes of this collaboration will be directly used in IDS services. |
Start Year | 2018 |
Description | Liaising with the Met Office |
Organisation | Meteorological Office UK |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We are developing algorithms that will help generate textual forecasts based on the weather data, which is a core task for the Met Office. |
Collaborator Contribution | The Met Office provides us with the textual forecasts and the weather data as well as advice on how to interpret them. |
Impact | We developed better insights into how the textual forecasts are produced. This is multi-disciplinary, since it combines knowledge from meteorology and computational linguistics. |
Start Year | 2015 |
Title | E2E NLG Challenge Metrics |
Description | The software is an implementation of 5 respected automatic word-overlap-based metrics for scoring natural language generation (NLG) outputs. The metrics scripts were previously available elsewhere, but this piece of software unifies the interface and simplifies the usage of all 5 metrics, thus providing a more detailed evaluation of NLG outputs. |
Type Of Technology | Software |
Year Produced | 2017 |
Open Source License? | Yes |
Impact | The software was used for evaluation in the E2E NLG challenge (https://github.com/tuetschek/e2e-metrics) as the official automatic evaluation procedure. It was used by the organizers for the final evaluation as well as by most of the 17 participants submitting NLG systems to the challenge during their system development. |
URL | http://www.macs.hw.ac.uk/InteractionLab/E2E/ |
Title | Imitation learning for natural language generation |
Description | Software that allows learning natural language generation models from unaligned data. |
Type Of Technology | Software |
Year Produced | 2016 |
Open Source License? | Yes |
Impact | This software achieved state-of-the-art performance on 3 datasets. |
URL | https://github.com/glampouras/JLOLS_NLG |
Company Name | Alana |
Description | Alana develops machine learning and natural language processing software for use in a variety of sectors. |
Year Established | 2019 |
Impact | We are currently investigating several potential use cases with the Royal Blind and Education providers. |
Website | https://alanaai.com/ |
Company Name | Alana |
Description | Alana develops machine learning and natural language processing software for use in a variety of sectors. |
Year Established | 2019 |
Impact | The Alana conversational AI platform. Currently in final stages of negotiating a project with UNICEF on tackling covid misinformation. Formal partnership with RNIB to develop conversational interfaces for blind and partially sighted people. |
Website | https://alanaai.com/ |
Description | 1st Workshop on Data-to-text Generation |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | The 1st Workshop on Data-to-text Generation covers a broad spectrum of areas: generating textual descriptions from data; decision support systems that facilitate data access using natural language; information presentation from data; and summarisation from data. It also aims to bridge the gap between Natural Language Generation and Data Science. We received 25 submissions, 6 of which were presented as talks and 19 as posters. As unanimously decided by the attendees, the workshop will now be an annual event following a similar informal format. |
Year(s) Of Engagement Activity | 2015 |
URL | http://www.macs.hw.ac.uk/InteractionLab/d2t/ |
Description | 6th Lisbon Machine Learning School |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | The school covers a range of machine learning (ML) topics, from theory to practice, that are important in solving natural language processing (NLP) problems that arise in the analysis and use of Web data. |
Year(s) Of Engagement Activity | 2016 |
URL | http://lxmls.it.pt/2016 |
Description | Amazon Alexa Summit 2017 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | Invited 3-day visit and presentation at Amazon HQ in Seattle to take part in the Alexa Summit/Symposium, aimed at industry practitioners and postgraduate students. |
Year(s) Of Engagement Activity | 2017 |
Description | BBC The Joy of AI |
Form Of Engagement Activity | A broadcast e.g. TV/radio/film/podcast (other than news/press) |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Media (as a channel to the public) |
Results and Impact | My research was featured in the BBC's documentary "The Joy of AI" |
Year(s) Of Engagement Activity | 2018 |
URL | https://www.bbc.co.uk/programmes/p06jt7j4 |
Description | BBC interview |
Form Of Engagement Activity | A press release, press conference or response to a media enquiry/interview |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Media (as a channel to the public) |
Results and Impact | Interview for BBC technology news |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.bbc.co.uk/news/technology-51064369 |
Description | CNBC Interview |
Form Of Engagement Activity | A press release, press conference or response to a media enquiry/interview |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Media (as a channel to the public) |
Results and Impact | Interview with CNBC on AI trends and research predictions for 2022 |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.cnbc.com/2022/01/07/deep-learning-and-large-language-how-ai-is-set-to-evolve-in-2022.htm... |
Description | DATAIA invited talk |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | I was invited to give a talk at the DATAIA Institute, the French institute for AI. |
Year(s) Of Engagement Activity | 2020 |
URL | http://dataia.eu/en/events/dataia-seminar-how-machines-learn-talk-challenges-and-opportunities-neura... |
Description | Data Science Athens meetup talk |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Industry/Business |
Results and Impact | Andreas Vlachos gave a talk at the data science meetup in Athens and had the opportunity to inform the audience about the research conducted in this grant. |
Year(s) Of Engagement Activity | 2017 |
URL | https://www.meetup.com/Data-Science-Athens/events/236632141/ |
Description | Diversity and inclusion in academic ICT research |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Study participants or study members |
Results and Impact | I am taking part in the focus group Diversity and inclusion in academic ICT research run by the EPSRC and organised by Edinburgh Napier University. |
Year(s) Of Engagement Activity | 2017 |
URL | https://www.epsrc.ac.uk/newsevents/news/ictdiversityinclusionresearch/ |
Description | Edinburgh Science Festival talk + discussion |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Public/other audiences |
Results and Impact | Ondrej Dusek gave a talk about the latest developments in conversational assistants (i.e., spoken dialogue systems) at the Edinburgh Science Festival event called "Your Robot Roommate", which concentrated on human-robot interactions. The event included talks by two other researchers (Kerstin Dautenhahn, Boris Mocialov) and a panel discussion about the future of robotics. The event was attended by about 30 members of the public, who were very interested in the subject and posed a lot of questions during the panel discussion and immediately after the event. |
Year(s) Of Engagement Activity | 2017 |
URL | https://www.sciencefestival.co.uk/event-details/your-robot-roommate |
Description | Interview for international news (WDR) |
Form Of Engagement Activity | A broadcast e.g. TV/radio/film/podcast (other than news/press) |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Media (as a channel to the public) |
Results and Impact | Interview for German national radio; almost the whole feature was about our group and our research. |
Year(s) Of Engagement Activity | 2018 |
URL | https://www1.wdr.de/mediathek/audio/wdr3/wdr3-kulturfeature/audio-sprich-mit-mir---versuche-mit-masc... |
Description | Interview for national news (Telegraph) |
Form Of Engagement Activity | A press release, press conference or response to a media enquiry/interview |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Media (as a channel to the public) |
Results and Impact | Interview for the Telegraph about Women in AI |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.telegraph.co.uk/technology/2019/03/08/artificial-intelligence-has-gender-problem-meet-pi... |
Description | Invited blog post on Understanding Uncertainty |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | I was invited to write an article for the blog "Understanding Uncertainty" by Prof Spiegelhalter (Winton Professor for the Public Understanding of Risk at Cambridge University), summarising my research on multimodal information presentation to communicate risk for decision support. |
Year(s) Of Engagement Activity | 2016 |
URL | https://understandinguncertainty.org/women-listen-and-men-look-how-best-communicate-risk-support-dec... |
Description | Invited industry talk at Thomson Reuters |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | Verena Rieser was invited to present her research to Thomson Reuters via an online seminar. This seminar will be broadcast to all research employees of Thomson Reuters worldwide. |
Year(s) Of Engagement Activity | 2017 |
Description | Invited keynote talk |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | Invited keynote talk at Swisstext, an annual conference bringing businesses and academic practitioners together. |
Year(s) Of Engagement Activity | 2017 |
URL | https://www.swisstext.org/2017/index.html |
Description | Invited seminar talk at Charles University, Prague |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | Ondrej Dusek gave a seminar talk about natural language generation (NLG) using sequence-to-sequence neural networks at the Institute of Formal and Applied Linguistics, Charles University in Prague. The talk was attended by around 20 faculty members and PhD students at the institute and sparked a lively discussion about NLG research. |
Year(s) Of Engagement Activity | 2017 |
URL | http://ufal.mff.cuni.cz/events/sequence-sequence-natural-language-generation-spoken-dialogue-systems |
Description | Invited seminar talk at the University of Pennsylvania, US. |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Postgraduate students |
Results and Impact | Verena Rieser gave an invited seminar talk at the University of Pennsylvania on: "From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again" |
Year(s) Of Engagement Activity | 2017 |
URL | https://pricelab.sas.upenn.edu/clunch16-17 |
Description | Invited talk at the Lisbon.AI meetup |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | I gave a talk at the Lisbon.AI meetup, which was attended by 100 participants and disseminated the work done in this project. |
Year(s) Of Engagement Activity | 2017 |
URL | http://lisbon.ai/ |
Description | Media Interviews regarding the Amazon Alexa Challenge |
Form Of Engagement Activity | A press release, press conference or response to a media enquiry/interview |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Media (as a channel to the public) |
Results and Impact | Interview at Scottish TV "Live at 5" about the Amazon Alexa Challenge. Several related newspaper articles, e.g. Sunday Herald, the Scotsman, etc. |
Year(s) Of Engagement Activity | 2017 |
URL | https://www.youtube.com/watch?v=ERSMhGFBxvw |
Description | Motion in Scottish Parliament recognising Amazon Alexa Prize |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Policymakers/politicians |
Results and Impact | Motion S5M-09326: Parliament congratulates Heriot-Watt University on its success in the Amazon Alexa Prize |
Year(s) Of Engagement Activity | 2017 |
URL | http://www.parliament.scot/parliamentarybusiness/28877.aspx?SearchType=Advance&ReferenceNumbers=S5M-... |
Description | NESTA 12 Women shaping AI |
Form Of Engagement Activity | A magazine, newsletter or online publication |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Media (as a channel to the public) |
Results and Impact | My profile was featured as one of 12 Women Shaping AI by NESTA. |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.nesta.org.uk/feature/12-women-ai/ |
Description | NESTA interview - 12 women shaping AI |
Form Of Engagement Activity | A press release, press conference or response to a media enquiry/interview |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Media (as a channel to the public) |
Results and Impact | Media interview and article published by NESTA (global innovation foundation) |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.nesta.org.uk/feature/12-women-ai/ |
Description | NSF Report |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Policymakers/politicians |
Results and Impact | Expert consultation by the National Science Foundation, USA |
Year(s) Of Engagement Activity | 2022 |
URL | https://arxiv.org/abs/2203.10012 |
Description | Native Scientist German School Outreach |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Schools |
Results and Impact | Verena Rieser engaged school children in her research. The half-day event was organised by Alleman Fun (German Saturday School) and Native Scientist. The engagement activity was held in German. |
Year(s) Of Engagement Activity | 2016 |
URL | http://www.macs.hw.ac.uk/RoboticsLab/news/german-native-scientist-volunteers-reaching-out-to-childre... |
Description | Organisation and programme chair for 9th International Conference on Natural Language Generation |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | Programme chair, local host and organisation for 9th International Conference on Natural Language Generation. |
Year(s) Of Engagement Activity | 2016 |
URL | http://www.macs.hw.ac.uk/InteractionLab/INLG2016/# |
Description | Panel member at CogX 2017, London |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | CogX was a two-day event specifically focused on the impact that AI has across industry, government and society as a whole. The event included a Trade Expo, two days of discussions, debates and breakout sessions, as well as the inaugural CogX Awards. It attracted over 1,500 attendees to explore the most topical AI trends, the first and second order effects of the AI revolution, and the challenges, opportunities and recommendations for how to navigate the new landscape which is rapidly reshaping the world around us. |
Year(s) Of Engagement Activity | 2017 |
URL | https://cogx.co/cogx2017/speakers/ |
Description | Plenary keynote at 1st workshop on NLP for Conversational AI (ACL2019, Florence) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Invited plenary keynote at 1st workshop on NLP for Conversational AI (ACL2019, Florence) |
Year(s) Of Engagement Activity | 2019 |
URL | https://sites.google.com/view/nlp4convai/program?authuser=0 |
Description | Plenary keynote at 2nd workshop on Vocal Interactivity in-and-between Humans, Animals and Robots (VIHAR-2019) (Turing Institute, London) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Invited plenary keynote at 2nd workshop on Vocal Interactivity in-and-between Humans, Animals and Robots (VIHAR-2019) (Turing Institute, London) |
Year(s) Of Engagement Activity | 2019 |
URL | http://vihar-2019.vihar.org/keynotes/ |
Description | Plenary keynote at IVA 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Invited plenary keynote at 19th International Conference on Intelligent Virtual Agents (IVA 2019, Paris) |
Year(s) Of Engagement Activity | 2019 |
URL | https://iva2019.sciencesconf.org/ |
Description | Scottish Minister for Higher Education visit |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Policymakers/politicians |
Results and Impact | Shirley-Anne Somerville, Minister for Higher Education, Further Education and Science, travelled to the Edinburgh Campus to meet with members of the University's 'What's Up Bot' team who recently returned from Amazon's prestigious AI competition, the Alexa Challenge. |
Year(s) Of Engagement Activity | 2017 |
URL | https://www.hw.ac.uk/about/news/minister-given-insight-into-the-future-of-ai.htm |
Description | Short Interview for Herald Scotland |
Form Of Engagement Activity | A press release, press conference or response to a media enquiry/interview |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Public/other audiences |
Results and Impact | Ondrej Dusek gave a short e-mail interview to Herald Scotland regarding the future of human-robot interaction and was quoted in the article "Tomorrow's World". |
Year(s) Of Engagement Activity | 2017 |
URL | http://www.heraldscotland.com/life_style/15166452.Tomorrow__39_s_World/ |
Description | SigDial panel discussion |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Invited Panel Member at 18th Annual SIGdial Meeting on Discourse and Dialogue Conference discussing "Natural Language Generation for Spoken Dialogue Systems". Other panel members were from Google (Head of NLG), Cambridge University and Bloomberg. |
Year(s) Of Engagement Activity | 2017 |
URL | https://www.superlectures.com/sigdial2017/panel-discussion |
Description | Talk at Amazon Research Cambridge |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Industry/Business |
Results and Impact | Andreas Vlachos gave an invited talk at Amazon Research, which resulted in very detailed discussions about imitation learning methods and their application to natural language understanding and generation problems. |
Year(s) Of Engagement Activity | 2016 |
Description | Talk at Facebook Artificial Intelligence Research |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | Andreas Vlachos gave an invited talk at Facebook AI Research, which resulted in very detailed discussions about imitation learning methods and their combination with neural network architectures. |
Year(s) Of Engagement Activity | 2016 |
Description | Talk at the Center for Information and Language Processing at the University of Munich |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Invited talk at this group, one of the most successful worldwide. |
Year(s) Of Engagement Activity | 2017 |
Description | Talk at the University of Cambridge |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Postgraduate students |
Results and Impact | I gave a talk at the University of Cambridge Natural Language and Information Processing group. |
Year(s) Of Engagement Activity | 2017 |
Description | Telegraph Pioneering Women in AI |
Form Of Engagement Activity | A press release, press conference or response to a media enquiry/interview |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Media (as a channel to the public) |
Results and Impact | My profile was featured in the Telegraph as one of its Pioneering Women in AI |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.telegraph.co.uk/technology/2019/03/08/artificial-intelligence-has-gender-problem-meet-pi... |
Description | Top 30 people to follow on Twitter |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Media (as a channel to the public) |
Results and Impact | I was nominated as one of the top 30 people in AI in Europe to follow on Twitter. |
Year(s) Of Engagement Activity | 2019 |
URL | https://sifted.eu/articles/30-ai-people-in-europe-to-follow-on-twitter/ |
Description | Tutorial on imitation learning at the Conference of the European Chapter of the Association for Computational Linguistics |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | Andreas Vlachos gave a tutorial at the flagship European conference in natural language processing together with Gerasimos Lampouras and Sebastian Riedel (Co-Investigator in Diligent). |
Year(s) Of Engagement Activity | 2017 |
URL | http://eacl2017.org/index.php/tutorials |
Description | Website for the Diligent project |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | All the project-related information of the Diligent project may be found on the website. |
Year(s) Of Engagement Activity | 2016 |
URL | http://diligent-project.tumblr.com |
Description | Women@CS |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Undergraduate students |
Results and Impact | Verena Rieser organises a local support group for female students studying Computer Science, inspired by the "Sisters Clubs" in American universities. The goal is to attract and retain female UG students to study CS. |
Year(s) Of Engagement Activity | 2016 |
Description | Top 30 Women in AI: UK Edition |
Form Of Engagement Activity | A magazine, newsletter or online publication |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Media (as a channel to the public) |
Results and Impact | I was nominated as #5 of top Women in AI in the UK by Re:Work |
Year(s) Of Engagement Activity | 2019 |
URL | https://blog.re-work.co/top-30-women-in-ai-uk-edition/ |