REG Challenge 2008: A Shared Task Evaluation Event for Referring Expression Generation

Lead Research Organisation: University of Brighton
Department Name: Sch of Computing, Engineering & Maths

Abstract

Natural Language Generation (NLG) is the subfield of Natural Language Processing (NLP) concerned with developing computational methods for automatically generating language, with the primary aims of economising text-production processes (for example, producing drafts of manuals or letters) and improving access to non-verbal information (for example, creating verbal descriptions for visually impaired users). Comparing how well alternative computational methods perform the same task ('comparative evaluation') is an important component of the consolidation of research effort and of technological progress in general. Comparative evaluation initiatives with associated competitions and events have been common in many NLP fields for some time, where they have been seen to galvanise research communities, create valuable new resources, and lead to rapid technological progress.

NLG has strong evaluation traditions, in particular in user evaluations of application systems, but also in embedded evaluation of NLG components against non-NLG baselines or against different versions of the same component. However, what has largely been missing are comparative evaluation results for comparable but independently developed NLG systems and tools; at present, only two sets of such results exist. Over the past two years, NLG researchers have become increasingly interested in comparative evaluation. We believe that comparative evaluation initiatives will have many beneficial effects for NLG, including the creation of resources, the focussing of research effort on specific tasks, and the attraction of new researchers to the field.

This year, we organised the Attribute Selection for Generating Referring Expressions (ASGRE) Challenge, a pilot NLG shared-task evaluation event. Participation was high and reactions from NLG researchers have been enthusiastic. We are therefore planning a full-scale NLG evaluation initiative, the Referring Expression Generation (REG) Challenge, for 2008. Unlike the two leading evaluation initiatives in the neighbouring fields of Machine Translation and Document Summarisation, which are funded and directed by US government agencies, the ASGRE and REG Challenges are community-led, UK-based evaluation initiatives. This proposal requests funding for data preparation and evaluation activities in the 2008 REG Challenge, to enable us to extend the range of shared tasks and the evaluation programme, and to keep this initiative community-based and UK-led.