End-to-end integrated Statistical processing for Context-aware dialogue systems

Lead Research Organisation: University of Edinburgh

Department Name: Sch of Informatics

Abstract

This project targets a new processing paradigm for the development and optimization of spoken dialogue systems (SDS) that are context-aware, efficient, and most importantly robust to the uncertainty that pervades natural language. We will develop tractable and effective techniques for the integrated end-to-end treatment of uncertainty in context-aware SDS, using learning algorithms combined with Partially Observable Markov Decision Processes (POMDPs). This requires us to develop effective methods for training and testing such systems. We will also determine, through system tests with real users, whether the end-to-end statistical treatment of uncertainty improves SDS for users, in comparison to rule-based and standard MDP-based techniques.

No current SDS treats dialogue processing as an end-to-end integrated statistical system, constrained by context, where uncertainty in one process feeds into other processes, where uncertainty in one dialogue state feeds into the next dialogue state, and where this whole system is constrained via contextual feedback. It is still standard practice to ignore the uncertainty in the output of a lower-level process by passing only a single best analysis to higher-level processes, with the side effect that lower-level processes do not take into account important high-level constraints. For example, contextual features of dialogues such as user goals or previous speech acts are not systematically exploited in speech recognition or utterance interpretation. This is a serious shortcoming for current SDS, given that uncertainty pervades and proliferates through every level of dialogue processing, from speech recognition errors through interpretation ambiguities, to uncertain dialogue states and competing strategies. These problems lead to the current situation where SDS are not robust or efficient enough for any but very simple tasks.

We will build and evaluate SDS which use statistical processing end-to-end, and which use context representations to constrain the uncertainty inherent in dialogue. We will build on existing knowledge and techniques developed in the TALK project, as well as recent corpora (COMMUNICATOR, TALK, AMI). The SDS development tools, components, and environments used and developed at Edinburgh's HCRC (e.g. DIPPER, HTK, Festival) also provide a number of existing dialogue systems (FLIGHTS, TALK, WITAS), forming a platform to be extended using the new methods developed in the project. These systems can then be used for testing, evaluation, and further data collection.

The proposal thus aims to improve dialogue system robustness and efficiency, and allow SDS to be developed and optimized using data-driven approaches. There is much user frustration with currently deployed SDS, so there is much to be gained from improved robustness and efficiency. Data-driven optimization will also lead to decreased deployment and development costs for industry. Thus the beneficiaries of this research will potentially be all future users of IT (including the illiterate and IT-illiterate, also in the developing world). In the short to medium term, commercial applications include: interactive SDS, dialogue and meeting summarisation, interactive entertainment, intelligent tutoring systems, intelligent personal assistants, and dialogue-supported question-answering and search.

With recent advances in speech recognition, parsing, context-sensitive statistical dialogue management, the theory of learning with Partially Observable states, and the availability of new, large, and richly annotated dialogue corpora, we are now in a position to treat dialogue processing as an end-to-end context-aware statistical system. We believe this model will lead to a breakthrough in robust, efficient, and natural human-computer SDS, and has the potential to radically improve the state-of-the-art in dialogue management.
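
The contrast drawn above, between passing on a single best analysis and letting uncertainty in one dialogue state feed into the next, can be illustrated with a small sketch. This is not the project's implementation: it is a simplified Bayesian belief update over a fixed set of user goals, without the action and transition models of a full POMDP. The function names, the toy likelihood, and the example utterances and confidence scores are all assumptions made for illustration.

```python
def update_belief(belief, nbest, likelihood):
    """Update a belief over user goals from a recogniser N-best list.

    belief:     dict mapping goal -> prior probability
    nbest:      list of (hypothesis, confidence) pairs (hypothetical ASR output)
    likelihood: function (hypothesis, goal) -> P(hypothesis | goal)
    """
    scores = {}
    for goal, prior in belief.items():
        # Sum over ALL hypotheses, weighted by confidence, instead of
        # committing to the single best one.
        evidence = sum(conf * likelihood(hyp, goal) for hyp, conf in nbest)
        scores[goal] = prior * evidence
    total = sum(scores.values())
    if total == 0:
        return dict(belief)  # no usable evidence: keep the prior
    return {goal: score / total for goal, score in scores.items()}


def toy_likelihood(hypothesis, goal):
    # Assumed toy model: a hypothesis mentioning the goal is far more likely.
    return 0.9 if goal in hypothesis else 0.1


# Turn 1: the recogniser is almost undecided between two destinations.
belief = {"boston": 0.5, "austin": 0.5}
belief = update_belief(
    belief, [("fly to austin", 0.55), ("fly to boston", 0.45)], toy_likelihood
)

# Turn 2: the belief from turn 1 acts as context for the next update.
belief = update_belief(
    belief, [("austin please", 0.6), ("boston please", 0.4)], toy_likelihood
)
print(belief)
```

A single-best pipeline would treat "austin" as certain after the first turn. Here the system keeps some probability on "boston", and its confidence in "austin" grows only as evidence accumulates across turns.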

Funded Value:

£266,906

Funded Period:

Jan 07 - Mar 09

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/E019501/1

Principal Investigator:

Oliver Lemon

Research Subject:

Info. & commun. Technol. (20%)

Linguistics (80%)

Research Topic:

Artificial Intelligence (20%)

Comput./Corpus Linguistics (80%)

Organisations

University of Edinburgh (Lead Research Organisation)

People	ORCID iD
Oliver Lemon (Principal Investigator)
James Henderson (Researcher Co-Investigator)

Publications

V Rieser (2008) Learning Effective Multimodal Dialogue Strategies from Wizard-of-Oz data: Bootstrapping and Evaluation in ACL

Key Findings
Impact Summary
Further Funding
Research Databases and Models
Software and Technical Products


Description	This project targeted a new processing paradigm for the development and optimization of spoken dialogue systems (SDS) that are context-aware, efficient, and most importantly robust to the uncertainty that pervades natural language. We published 5 journal papers, 8 conference papers, and 5 workshop papers on these issues (all peer reviewed). We also gave several invited talks on this research (e.g. Oxford University 2008, King's College London 2008, International Conference on Human-Computer Conversation 2008). We developed tractable and effective techniques for the integrated end-to-end treatment of uncertainty in context-aware SDS, using learning algorithms combined with Partially Observable Markov Decision Processes (POMDPs). This required us to develop effective methods for training and testing such systems. We also investigated, through system tests in more than 400 test dialogues, whether this end-to-end statistical treatment of uncertainty improves SDS for end users. We have shown that our proposed model has specific benefits: for example a 5% reduction in Word Error Rates using a memory-based learning context-sensitive speech recognition method (Lemon & Konstas EACL 2009). This is an example of reducing uncertainty in the output of a lower-level process by taking into account high-level contextual constraints in an end-to-end statistical model. The project achieved its planned milestones (M1 - M4). The first end-to-end statistical system was built by month 10 (the "Q-MDP system": M1). This system was evaluated in simulation and with real users at the start of year 2 (M3), and year 2 then focussed on the creation of a POMDP SDS within this framework (M2). This system was completed towards the end of 2008, and refined and evaluated (in simulation and with real users: M4) in early 2009. 
In summary, we built and evaluated one of the first SDS which used statistical processing end-to-end, and which used context representations to constrain the uncertainty inherent in dialogue. We have shown that this model produces more robust and efficient SDS, especially as noise increases. We have therefore shown how to improve dialogue system robustness and efficiency, and developed a methodology allowing SDS to be developed and optimized using data-driven approaches. We have shown that such an approach can ultimately lead to a breakthrough in robust, efficient, and natural human-computer SDS, and has the potential to improve the state-of-the-art in dialogue management.
Exploitation Route	Applications of this work include: interactive spoken dialogue systems, dialogue and meeting summarisation, interactive entertainment, intelligent tutoring systems, intelligent personal assistants, and dialogue-supported question-answering and search on mobile devices, such as Apple's Siri.
Sectors	Digital/Communication/Information Technologies (including Software)
URL	https://sites.google.com/site/epsrcstatdial/


Description	Applications of this work include: interactive spoken dialogue systems, dialogue and meeting summarisation, interactive entertainment, intelligent tutoring systems, intelligent personal assistants, and dialogue-supported question-answering and search on mobile devices, such as Apple's Siri, Microsoft's Cortana, and Google Now. Our findings underpin some of the fundamental technological approaches used in such systems.
First Year Of Impact	2008
Sector	Digital/Communication/Information Technologies (including Software)
Impact Types	Societal,Economic


Description	EC FP7 ICT grant: SpaceBook
Amount	€ 645,000 (EUR)
Funding ID	270019
Organisation	European Commission
Sector	Public
Country	European Union (EU)
Start	03/2011
End	02/2014


Description	EC FP7 ICT project: JAMES: Joint Action for Multimodal Embodied Social Systems
Amount	€ 3,209,918 (EUR)
Funding ID	270435
Organisation	European Commission
Sector	Public
Country	European Union (EU)
Start	02/2011
End	09/2014


Title	End-to-End statistical dialogue management algorithms
Description	Collection of algorithms for end-to-end statistical processing in automated spoken dialogue systems. See project publications for details
Type Of Material	Computer model/algorithm
Year Produced	2010
Provided To Others?	Yes
Impact	Use in future EC FP7 projects such as SpaceBook, JAMES, PARLANCE.


Title	Spoken dialogue data -- end-to-end
Description	Spoken dialogue data collected in the evaluation of the end-to-end statistical systems
Type Of Material	Database/Collection of data
Provided To Others?	No
Impact	Data used in follow-up EPSRC project and also EC FP7 projects such as SpaceBook and PARLANCE


Title	ABC demonstration / evaluation automated spoken dialogue system (telephone-based)
Description	Working automated spoken dialogue system used in project demonstrations (see publications) and for system evaluations with real users, in telephone-based interactions.
Type Of Technology	Webtool/Application
Year Produced	2012
Impact	Platform used in further research in EC FP7 projects such as PARLANCE and SpaceBook.
URL	https://sites.google.com/site/abcpomdp/home
