Automatic Adaptation of Knowledge Structures for Assisted Information Seeking (AutoAdapt)

Lead Research Organisation: Robert Gordon University
Department Name: School of Computing

Abstract

A massive number of electronic document collections exist within companies, universities and other institutions. Two common forms of information seeking are searching and exploring (browsing) the collections. However, finding relevant information within such collections can be difficult. This is true for searching with poorly formulated or underspecified queries as well as for browsing where the user may not have a specific target in mind. The user's information seeking could be assisted by well-structured knowledge about the search domain, which we refer to as a domain model. A domain model is effectively a structure that people impose on data to support them in information seeking. Query modification or browsing suggestions can then be derived directly from the domain model. To illustrate the point using a realistic example, assume a user of the University of Essex intranet started by searching for "union". This query would trigger the search system to offer query refinement terms such as "students union" and "european union". Indeed, all local Web sites, intranets and similar collections contain a huge amount of valuable domain knowledge that is encoded implicitly. The challenge is to automatically acquire a domain model and then make it usable by assisting users in information seeking tasks such as searching or browsing. An even bigger challenge is to evolve this domain model automatically. The novelty of this proposal lies in evolving automatically acquired domain knowledge by observing how users use it and altering it accordingly. We hypothesize that the submitted user queries and the dialogues between users and the search system can be monitored and used to improve the domain model over time. A user's selection of a query modification suggestion is taken as an indication of relevance. This can then be used to update the domain knowledge and thus help the next user with a similar query by presenting updated query modification suggestions.
This project aims to develop and evaluate methods for adapting automatically constructed domain models to the search and browsing behaviour of the user population. The developed methods will be applied and evaluated at large scale in two information seeking scenarios, namely interactive search and browsing, on a number of domains including the intranets of the University of Essex, the Open University and our industrial partners.

Publications

Zhu J (2009) Integrating multiple document features in language models for expert finding in Knowledge and Information Systems

Zhu J (2009) Integrating multiple windows and document features for expert finding in Journal of the American Society for Information Science and Technology

Zhang P (2011) Developing Position Structure-Based Framework for Chinese Entity Relation Extraction in ACM Transactions on Asian Language Information Processing

Yan X (2011) Toward a semantic granularity model for domain-specific information retrieval in ACM Transactions on Information Systems

Kruschwitz U (2013) Deriving query suggestions for site search in Journal of the American Society for Information Science and Technology

 
Description A massive number of electronic document collections exist within companies, universities and other institutions. To access such collections users typically search or explore (browse) them. However, finding relevant information within such collections can be difficult. This is true for searching with poorly formulated and underspecified queries as well as for browsing where the user may not have a specific target in mind. The user's information seeking could be assisted by well-structured knowledge about the search domain, which we refer to as a domain model. A domain model is effectively a structure that people impose on data to support them in information seeking. Query modification or browsing suggestions can then be derived directly from the domain model. To illustrate the point using a realistic example, assume a user of the University of Essex intranet started by searching for "union". This query could trigger the search system to offer query refinement terms such as "students union", "trade union" and "european union". Web search engines have recently started to move towards increasingly interactive search assistance, but such features are still not very common on Web sites, in digital libraries and in enterprise search applications. The AutoAdapt project has made significant progress in addressing this issue.

The types of document collection we focussed on in the AutoAdapt project, i.e. local Web sites, intranets, digital libraries and similar collections, typically do not come with explicitly encoded knowledge that can be used for interactive search support. However, all of these collections contain a huge amount of valuable domain knowledge that is encoded implicitly. For example, a user's selection of a query modification suggestion can be taken as an indication of relevance. This can then be used to update the (previously acquired) domain knowledge and thus help the next user with a similar query by presenting updated query modification suggestions. The first challenge is to automatically acquire a usable domain model. An even bigger challenge is to evolve this domain model automatically. In the AutoAdapt project we systematically studied a variety of adaptive learning approaches applied to domain models, exploiting the actual documents, query logs and different forms of implicit relevance feedback. Through this research we have demonstrated that methods such as ant colony optimisation, click graphs and formal concept analysis can be applied to learn domain models that improve over time. We also developed a novel methodology that combines the document collection at hand with query logs, which leads to a model that is significantly better than either approach applied individually.
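The feedback loop described above (a selected suggestion counts as evidence of relevance and changes what the next user sees) can be sketched in a few lines. This is a minimal illustration only, loosely inspired by the ant colony idea of depositing and evaporating weights; it is not the project's implementation. The class name `DomainModel`, the `evaporation` and `deposit` parameters and all method names are assumptions introduced for this example.

```python
from collections import defaultdict


class DomainModel:
    """Hypothetical toy domain model: weighted query -> refinement edges."""

    def __init__(self, evaporation=0.1, deposit=1.0):
        self.evaporation = evaporation  # assumed: fraction of weight lost per decay step
        self.deposit = deposit          # assumed: weight added per selected suggestion
        self.weights = defaultdict(dict)

    def record_selection(self, query, refinement):
        """Treat a user's choice of a suggestion as implicit relevance feedback."""
        edges = self.weights[query]
        edges[refinement] = edges.get(refinement, 0.0) + self.deposit

    def evaporate(self):
        """Decay all edge weights so that stale associations fade over time."""
        for edges in self.weights.values():
            for refinement in edges:
                edges[refinement] *= 1.0 - self.evaporation

    def suggest(self, query, k=3):
        """Return up to k refinements, highest weight first (ties broken by name)."""
        edges = self.weights.get(query, {})
        ranked = sorted(edges.items(), key=lambda item: (-item[1], item[0]))
        return [refinement for refinement, _ in ranked[:k]]


model = DomainModel()
for choice in ["students union", "students union", "european union", "trade union"]:
    model.record_selection("union", choice)
model.evaporate()  # e.g. run once per batch of logged sessions
suggestions = model.suggest("union")
```

In this toy run "students union" is ranked first for the query "union" because it was selected twice. A real system would also seed the model from the document collection and from query logs rather than starting empty.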

Apart from developing methods for domain model adaptation, we systematically evaluated each approach using a variety of commonly used evaluation techniques and by participating in relevant evaluation campaigns such as TREC and CLEF. We also developed a novel evaluation framework (AutoEval) that will help researchers in evaluating interactive search applications, an area that is much less developed than standard ad hoc search.
Exploitation Route We expect the project to have further significant impact. So far there are two main outcomes. The first is the publication of our research in the main research outlets of the areas we were working in (involving a number of papers with international collaborators and PhD students). The second is that the University of Essex Web site search is now based on the search framework developed within the project, which incorporates an adaptive domain model learning component. In addition, we will release the implementations of a selection of adaptive models as an open source project. We are also currently running a live trial of a version of our prototype in collaboration with BT.
Sectors Digital/Communication/Information Technologies (including Software)

 
Description We expect the project to have further significant impact. So far there are two main outcomes. The first is the publication of our research in the main research outlets of the areas we were working in (involving a number of papers with international collaborators and PhD students). The second is that the University of Essex Web site search is now based on the search framework developed within the project, which incorporates an adaptive domain model learning component. In addition, we will release the implementations of a selection of adaptive models as an open source project. We are also currently running a live trial of a version of our prototype in collaboration with BT.
First Year Of Impact 2013
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic

 
Description Marie Curie Innovative Training Networks
Amount € 3,460,000 (EUR)
Funding ID 721321 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 01/2017 
End 12/2020
 
Description NRP studentship project: Hybrid User Profiling and Adaptation for Personalised Search in Social Media Environment
Amount £25,908 (GBP)
Organisation Northern Research Partnership 
Sector Academic/University
Country United Kingdom
Start 10/2009 
End 07/2012
 
Title A hybrid approach for construction and adaptation of domain knowledge structures 
Description A hybrid approach for automatic construction and adaptation of domain knowledge structures from text documents and user query logs. 
Type Of Material Computer model/algorithm 
Provided To Others? No  
Impact The approach has been applied to University of Essex intranet query log data to recommend query term suggestions. 
 
Description NRP PhD studentship project 
Organisation Aston University
Department Computer Science
Country United Kingdom 
Sector Academic/University 
PI Contribution In the context of our EPSRC AutoAdapt and Renaissance projects, we collaborated with the University of Aberdeen and Yahoo Research in Barcelona to secure a new PhD studentship project from the Northern Research Partnership. The PhD project title is: Hybrid User Profiling and Adaptation for Personalised Search in Social Media Environment.
Collaborator Contribution The University of Aberdeen and Yahoo Research in Barcelona supported our NRP studentship project application. The student was co-supervised by Dr Jeff Pan of the University of Aberdeen.
Impact Leszek Kaliciak completed his PhD in 2013.
Start Year 2009
 
Description NRP PhD studentship project 
Organisation Yahoo!
Department Yahoo! Research
Country United States 
Sector Private 
PI Contribution In the context of our EPSRC AutoAdapt and Renaissance projects, we collaborated with the University of Aberdeen and Yahoo Research in Barcelona to secure a new PhD studentship project from the Northern Research Partnership. The PhD project title is: Hybrid User Profiling and Adaptation for Personalised Search in Social Media Environment.
Collaborator Contribution The University of Aberdeen and Yahoo Research in Barcelona supported our NRP studentship project application. The student was co-supervised by Dr Jeff Pan of the University of Aberdeen.
Impact Leszek Kaliciak completed his PhD in 2013.
Start Year 2009
 
Title AutoAdapt Web-based demonstration system 
Description A Web-based application that provides search and adaptation services for intranets. It can be used as a standalone system or integrated as a Web service in other applications. 
Type Of Technology Webtool/Application 
Year Produced 2011 
Impact It was used as the University of Essex intranet search engine. 
URL http://autoadaptproject.org