End-to-end integrated Statistical processing for Context-aware dialogue systems

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Informatics

Abstract

This project targets a new processing paradigm for the development and optimization of spoken dialogue systems (SDS) that are context-aware, efficient, and most importantly robust to the uncertainty that pervades natural language. We will develop tractable and effective techniques for the integrated end-to-end treatment of uncertainty in context-aware SDS, using learning algorithms combined with Partially Observable Markov Decision Processes (POMDPs). This requires us to develop effective methods for training and testing such systems. We will also determine, through system tests with real users, whether the end-to-end statistical treatment of uncertainty improves SDS for users, in comparison to rule-based and standard MDP-based techniques.

No current SDS treats dialogue processing as an end-to-end integrated statistical system, constrained by context, where uncertainty in one process feeds into other processes, where uncertainty in one dialogue state feeds into the next dialogue state, and where this whole system is constrained via contextual feedback. It is still standard practice to ignore the uncertainty in the output of a lower-level process by passing only a single best analysis to higher-level processes, with the side effect that lower-level processes do not take into account important high-level constraints. For example, contextual features of dialogues such as user goals or previous speech acts are not systematically exploited in speech recognition or utterance interpretation. This is a serious shortcoming for current SDS, given that uncertainty pervades and proliferates through every level of dialogue processing, from speech recognition errors through interpretation ambiguities, to uncertain dialogue states and competing strategies. These problems lead to the current situation where SDS are not robust or efficient enough for any but very simple tasks.

We will build and evaluate SDS which use statistical processing end-to-end, and which use context representations to constrain the uncertainty inherent in dialogue. We will build on existing knowledge and techniques developed in the TALK project, as well as recent corpora (COMMUNICATOR, TALK, AMI). The SDS development tools, components, and environments used and developed at Edinburgh's HCRC (e.g. DIPPER, HTK, Festival) also provide a number of existing dialogue systems (FLIGHTS, TALK, WITAS), forming a platform to be extended using the new methods developed in the project. These systems can then be used for testing, evaluation, and further data collection.

The proposal thus aims to improve dialogue system robustness and efficiency, and allow SDS to be developed and optimized using data-driven approaches. There is much user frustration with currently deployed SDS, so there is much to be gained from improved robustness and efficiency. Data-driven optimization will also lead to decreased deployment and development costs for industry. Thus the beneficiaries of this research will potentially be all future users of IT (including the illiterate and IT-illiterate, also in the developing world). In the short to medium term, commercial applications include: interactive SDS, dialogue and meeting summarisation, interactive entertainment, intelligent tutoring systems, intelligent personal assistants, and dialogue-supported question-answering and search.

With recent advances in speech recognition, parsing, context-sensitive statistical dialogue management, the theory of learning with Partially Observable states, and the availability of new, large, and richly annotated dialogue corpora, we are now in a position to treat dialogue processing as an end-to-end context-aware statistical system. We believe this model will lead to a breakthrough in robust, efficient, and natural human-computer SDS, and has the potential to radically improve the state of the art in dialogue management.
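
The contrast drawn in the abstract, between passing a single best analysis upward and carrying uncertainty from one dialogue state to the next, can be illustrated with a small sketch. It is not the project's implementation: the function name `update_belief`, the `floor` parameter, and the example goals and confidence scores are all hypothetical, chosen only to show a Bayesian-style belief update over user goals from an N-best list.

```python
from typing import Dict, List, Tuple


def update_belief(prior: Dict[str, float],
                  nbest: List[Tuple[str, float]],
                  floor: float = 0.05) -> Dict[str, float]:
    """Update a belief over user goals from an N-best list of hypotheses.

    prior: probability of each goal before the current user utterance
           (this is where dialogue context enters).
    nbest: (goal hypothesis, confidence) pairs from recognition/interpretation.
    floor: assumed likelihood for goals missing from the N-best list, so a
           recognition error never rules a goal out completely.
    """
    # Observation likelihood for every goal, defaulting to the floor value.
    likelihood = {goal: floor for goal in prior}
    for goal, confidence in nbest:
        if goal in likelihood:
            likelihood[goal] = max(likelihood[goal], confidence)

    # Combine prior and likelihood, then renormalise to a distribution.
    unnormalised = {goal: prior[goal] * likelihood[goal] for goal in prior}
    total = sum(unnormalised.values())
    return {goal: p / total for goal, p in unnormalised.items()}


# Hypothetical example: context favours "flight", the recogniser's top
# hypothesis is "hotel" by a small margin.
prior = {"flight": 0.5, "hotel": 0.3, "car": 0.2}
nbest = [("hotel", 0.5), ("flight", 0.4)]
belief = update_belief(prior, nbest)
```

A one-best pipeline would commit to "hotel" here, whereas the belief keeps "flight" as the most probable goal and retains a small probability for "car", which later turns can revise.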

Publications

 
Description This project targeted a new processing paradigm for the development and optimization of spoken dialogue systems (SDS) that are context-aware, efficient, and most importantly robust to the uncertainty that pervades natural language. We published 5 journal papers, 8 conference papers, and 5 workshop papers on these issues (all peer reviewed). We also gave several invited talks on this research (e.g. Oxford University 2008, King's College London 2008, International Conference on Human-Computer Conversation 2008).

We developed tractable and effective techniques for the integrated end-to-end treatment of uncertainty in context-aware SDS, using learning algorithms combined with Partially Observable Markov Decision Processes (POMDPs). This required us to develop effective methods for training and testing such systems. We also investigated, through system tests in more than 400 test dialogues, whether this end-to-end statistical treatment of uncertainty improves SDS for end users. We have shown that our proposed model has specific benefits: for example a 5% reduction in Word Error Rates using a memory-based learning context-sensitive speech recognition method (Lemon & Konstas EACL 2009). This is an example of reducing uncertainty in the output of a lower-level process by taking into account high-level contextual constraints in an end-to-end statistical model.
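
For readers unfamiliar with the metric, Word Error Rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a recogniser's output and a reference transcript, divided by the number of reference words. The sketch below shows that standard calculation only; the function name and example sentences are illustrative, and it does not reproduce the project's evaluation setup. The record does not state whether the 5% reduction is absolute or relative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative transcripts: two substituted words out of five.
wer = word_error_rate("i want a cheap flight", "i want the cheap flights")
```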

The project achieved its planned milestones (M1 - M4). The first end-to-end statistical system was built by month 10 (the "Q-MDP system": M1). This system was evaluated in simulation and with real users at the start of year 2 (M3), and year 2 then focussed on the creation of a POMDP SDS within this framework (M2). This system was completed towards the end of 2008, and refined and evaluated (in simulation and with real users: M4) in early 2009.

In summary, we built and evaluated one of the first SDS which used statistical processing end-to-end, and which used context representations to constrain the uncertainty inherent in dialogue. We have shown that this model produces more robust and efficient SDS, especially as noise increases.

We have therefore shown how to improve dialogue system robustness and efficiency, and developed a methodology allowing SDS to be developed and optimized using data-driven approaches.

We have shown that such an approach can ultimately lead to a breakthrough in robust, efficient, and natural human-computer SDS, and has the potential to improve the state-of-the-art in dialogue management.
Exploitation Route Applications of this work include: interactive spoken dialogue systems, dialogue and meeting summarisation, interactive entertainment, intelligent tutoring systems, intelligent personal assistants, and dialogue-supported question-answering and search on mobile devices -- such as Apple's Siri.
Sectors Digital/Communication/Information Technologies (including Software)

URL https://sites.google.com/site/epsrcstatdial/
 
Description Applications of this work include: interactive spoken dialogue systems, dialogue and meeting summarisation, interactive entertainment, intelligent tutoring systems, intelligent personal assistants, and dialogue-supported question-answering and search on mobile devices -- such as Apple's Siri, Microsoft's Cortana, and Google Now. Our findings form some of the fundamental technological approaches for such systems.
First Year Of Impact 2008
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Societal,Economic

 
Description EC FP7 ICT grant: SpaceBook
Amount € 645,000 (EUR)
Funding ID 270019 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 03/2011 
End 02/2014
 
Description EC FP7 ICT project: JAMES: Joint Action for Multimodal Embodied Social Systems
Amount € 3,209,918 (EUR)
Funding ID 270435 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 02/2011 
End 09/2014
 
Title End-to-End statistical dialogue management algorithms 
Description Collection of algorithms for end-to-end statistical processing in automated spoken dialogue systems. See project publications for details.
Type Of Material Computer model/algorithm 
Year Produced 2010 
Provided To Others? Yes  
Impact Use in future EC FP7 projects such as SpaceBook, JAMES, PARLANCE. 
 
Title Spoken dialogue data -- end-to-end 
Description Spoken dialogue data collected in the evaluation of the end-to-end statistical systems 
Type Of Material Database/Collection of data 
Provided To Others? No  
Impact Data used in follow-up EPSRC project and also EC FP7 projects such as SpaceBook and PARLANCE 
 
Title ABC demonstration / evaluation automated spoken dialogue system (telephone-based) 
Description Working automated spoken dialogue system used in project demonstrations (see publications) and for system evaluations with real users, in telephone-based interactions. 
Type Of Technology Webtool/Application 
Year Produced 2012 
Impact Platform used in further research in EC FP7 projects such as PARLANCE and SpaceBook. 
URL https://sites.google.com/site/abcpomdp/home