Automatic Adaptation of Knowledge Structures for Assisted Information Seeking (AutoAdapt)

Lead Research Organisation: Robert Gordon University

Department Name: School of Computing

Abstract

A massive number of electronic document collections exist within companies, universities and other institutions. Two common forms of information seeking are searching and exploring (browsing) the collections. However, finding relevant information within such collections can be difficult. This is true for searching with poorly formulated and underspecified queries as well as for browsing where the user may not have a specific target to search for. The user's information seeking could be assisted by well-structured knowledge about the search domain, which we refer to as a domain model. A domain model is effectively a structure that people impose on data to support them in information seeking. We can then derive query modification or browsing suggestions directly from the domain model. To illustrate the point using a realistic example, assume a user of the University of Essex intranet started by searching for "union". This query would trigger the search system to offer query refinement terms such as "students union" and "european union". Indeed, all local Web sites, intranets and similar collections do contain a huge amount of valuable domain knowledge that is encoded implicitly. The challenge is to automatically acquire a domain model and then make it usable by assisting users in information seeking tasks such as searching or browsing. An even bigger challenge is to evolve this domain model automatically. The novelty of this proposal lies in evolving automatically acquired domain knowledge by observing users' usage of it and altering it accordingly. We hypothesize that the submitted user queries and the dialogues between users and the search system can be monitored and used to improve the domain model over time. A user's selection of a query modification suggestion is taken as an indication of relevance. This can then be used to update the domain knowledge and thus help the next user with a similar query by presenting updated query modification suggestions.
This project aims to develop and evaluate methods for adapting automatically constructed domain models to the search and browsing behaviour of the user population. Application and large-scale evaluation of the developed methods in two information seeking scenarios, namely interactive search and browsing, will be performed on a number of domains including the intranets of the University of Essex, the Open University and our industrial partners.
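
The feedback loop described in the abstract can be illustrated with a minimal Python sketch. It is not the project's implementation: the class name DomainModel, its methods and the additive weighting scheme are all assumptions made for illustration. The domain model is represented as a weighted graph from a query to refinement terms, and each user selection of a suggestion is treated as implicit relevance feedback that raises the weight of that refinement.

```python
from collections import defaultdict


class DomainModel:
    """Toy domain model (hypothetical): a weighted graph from a query to refinement terms."""

    def __init__(self):
        # weights[query][refinement] -> accumulated evidence of relevance
        self.weights = defaultdict(lambda: defaultdict(float))

    def add_relation(self, query, refinement, weight=1.0):
        """Seed the model, e.g. with relations extracted from the document collection."""
        self.weights[query][refinement] += weight

    def record_selection(self, query, refinement, reward=1.0):
        """Treat a user's choice of a suggestion as implicit relevance feedback."""
        self.weights[query][refinement] += reward

    def suggest(self, query, k=3):
        """Return up to k refinements, highest weight first (ties broken by name)."""
        candidates = self.weights.get(query, {})
        ranked = sorted(candidates.items(), key=lambda item: (-item[1], item[0]))
        return [term for term, _ in ranked[:k]]


model = DomainModel()
for term in ("students union", "european union"):
    model.add_relation("union", term)

before = model.suggest("union")                    # equal weights, alphabetical order
model.record_selection("union", "students union")  # one user picks this refinement
after = model.suggest("union")                     # the selected term now ranks first
```

In this sketch the next user who searches for "union" sees "students union" ahead of "european union", which mirrors the idea of helping later users with updated suggestions.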

Funded Value:

£325,055

Funded Period:

Dec 08 - May 12

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/F035705/1

Principal Investigator:

Dawei Song

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Artificial Intelligence (25%)

Information & Knowledge Mgmt (75%)

Organisations

People	ORCID iD
Dawei Song (Principal Investigator)
Anne De Roeck (Co-Investigator)

Publications

Zhu J (2009) Integrating multiple document features in language models for expert finding in Knowledge and Information Systems

Zhu J (2009) Integrating multiple windows and document features for expert finding in Journal of the American Society for Information Science and Technology

Liu H (2010) Applying information foraging theory to understand user interaction with content-based image retrieval

Cerviño Beresi U (2010) Research and Advanced Technology for Digital Libraries

Cerviño Beresi U (2010) Advances in Information Retrieval

Beresi U (2010) Why did you pick that? Visualising relevance criteria in exploratory search in International Journal on Digital Libraries

Dignum S (2010) Incorporating Seasonality into Search Suggestions Derived from Intranet Query Logs

Kaliciak L (2010) Novel local features with hybrid sampling technique for image retrieval

Yan X (2011) Toward a semantic granularity model for domain-specific information retrieval in ACM Transactions on Information Systems

Albakour M (2011) Information Retrieval Technology

Key Findings
Impact Summary
Further Funding
Research Databases and Models
Collaboration
Software and Technical Products


Description	A massive number of electronic document collections exist within companies, universities and other institutions. To access such collections users typically search or explore (browse) them. However, finding relevant information within such collections can be difficult. This is true for searching with poorly formulated and underspecified queries as well as for browsing where the user may not have a specific target to search for. The user's information seeking could be assisted by well-structured knowledge about the search domain, which we refer to as a domain model. A domain model is effectively a structure that people impose on data to support them in information seeking. We can then derive query modification or browsing suggestions directly from the domain model. To illustrate the point using a realistic example, assume a user of the University of Essex intranet started by searching for "union". This query could trigger the search system to offer query refinement terms such as "students union", "trade union" and "european union". Web search engines have recently started to move towards more interactive search assistance, but such features are still not very common on Web sites, in digital libraries and in enterprise search applications. The AutoAdapt project has made significant progress in addressing this issue. The types of document collection we focussed on in the AutoAdapt project, i.e. local Web sites, intranets, digital libraries and similar collections, typically do not come with explicitly encoded knowledge that can be used for interactive search support. However, these collections do contain a huge amount of valuable domain knowledge that is encoded implicitly. For example, a user's selection of a query modification suggestion can be taken as an indication of relevance. This can then be used to update the (previously acquired) domain knowledge and thus help the next user with a similar query by presenting updated query modification suggestions.
The first challenge is to automatically acquire a usable domain model. An even bigger challenge is to evolve this domain model automatically. In the AutoAdapt project we systematically studied a variety of adaptive learning approaches applied to domain models by exploiting the actual documents, query logs and different forms of implicit relevance feedback. Through this research we have demonstrated that methods such as ant colony optimisation, click graphs and formal concept analysis can be applied to learn domain models that improve over time. We also developed a novel methodology that combines the document collection at hand with query logs, which leads to a model that is significantly better than either approach applied individually. Apart from developing methods for domain model adaptation, we systematically evaluated each such approach using a variety of commonly used evaluation techniques and by participating in relevant competitions such as TREC and CLEF. We also developed a novel evaluation framework (AutoEval) that will help researchers in evaluating interactive search applications, an area that is much less developed than standard ad hoc search.
Exploitation Route	We expect the project to have further significant impact. So far there are two main outcomes. The first is the publication of our research in the main research outlets of the areas we worked in (involving a number of papers with international collaborators and PhD students). The second is that the University of Essex Web site search is now based on the search framework developed within the project, which incorporates an adaptive domain model learning component. In addition, we will release the implementations of a selection of adaptive models as an open source project. We are also currently running a live trial of a version of our prototype in collaboration with BT.
Sectors	Digital/Communication/Information Technologies (including Software)
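
The key findings above name ant colony optimisation as one method for learning domain models that improve over time. The following Python sketch shows only the generic evaporation and deposit idea behind such an approach; it is not the project's algorithm, and the function name aco_update, its parameters and the numeric values are assumptions chosen for illustration. Every existing query-to-refinement edge decays a little in each update, while edges that users actually selected are reinforced.

```python
def aco_update(pheromone, selections, evaporation=0.1, deposit=1.0):
    """Return a new weight table after one batch of user selections (hypothetical helper).

    pheromone:  dict mapping (query, refinement) -> weight
    selections: iterable of (query, refinement) pairs chosen by users
    """
    # Evaporation: every existing edge decays, so stale associations fade over time.
    updated = {edge: (1.0 - evaporation) * weight for edge, weight in pheromone.items()}
    # Deposit: each observed selection reinforces its edge (new edges start from zero).
    for edge in selections:
        updated[edge] = updated.get(edge, 0.0) + deposit
    return updated


pheromone = {
    ("union", "students union"): 1.0,
    ("union", "european union"): 1.0,
}
# One batch of feedback in which a user picked "students union" after searching for "union".
day_one = aco_update(pheromone, [("union", "students union")], evaporation=0.5)
```

The function returns a new table instead of modifying its input, so successive batches of query log data can be replayed and compared. Refinements that are no longer selected gradually lose weight, which is one simple way for a model to follow changes in user interests.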


Description	We expect the project to have further significant impact. So far there are two main outcomes. The first is the publication of our research in the main research outlets of the areas we worked in (involving a number of papers with international collaborators and PhD students). The second is that the University of Essex Web site search is now based on the search framework developed within the project, which incorporates an adaptive domain model learning component. In addition, we will release the implementations of a selection of adaptive models as an open source project. We are also currently running a live trial of a version of our prototype in collaboration with BT.
First Year Of Impact	2013
Sector	Digital/Communication/Information Technologies (including Software)
Impact Types	Economic


Description	Marie Curie Innovative Training Networks
Amount	€ 3,460,000 (EUR)
Funding ID	721321
Organisation	European Commission
Sector	Public
Country	European Union (EU)
Start	01/2017
End	12/2020


Description	NRP studentship project: Hybrid User Profiling and Adaptation for Personalised Search in Social Media Environment
Amount	£25,908 (GBP)
Organisation	Northern Research Partnership
Sector	Academic/University
Country	United Kingdom
Start	09/2009
End	07/2012


Title	A hybrid approach for construction and adaptation of domain knowledge structures
Description	A hybrid approach for automatic construction and adaptation of domain knowledge structures from text documents and user query logs.
Type Of Material	Computer model/algorithm
Provided To Others?	No
Impact	The approach has been applied to Essex University intranet query log data for recommendation of query term suggestions.


Description	NRP PhD studentship project
Organisation	Aston University
Department	Computer Science
Country	United Kingdom
Sector	Academic/University
PI Contribution	In the context of our EPSRC AutoAdapt and Renaissance projects, we collaborated with the University of Aberdeen and Yahoo Research in Barcelona to secure a new PhD studentship project from the Northern Research Partnership. The PhD project title is: Hybrid User Profiling and Adaptation for Personalised Search in Social Media Environment.
Collaborator Contribution	The University of Aberdeen and Yahoo Research in Barcelona supported our NRP studentship project application. The student was co-supervised by Dr Jeff Pan of the University of Aberdeen.
Impact	Leszek Kaliciak completed his PhD in 2013.
Start Year	2009


Description	NRP PhD studentship project
Organisation	Yahoo!
Department	Yahoo! Research
Country	United States
Sector	Private
PI Contribution	In the context of our EPSRC AutoAdapt and Renaissance projects, we collaborated with the University of Aberdeen and Yahoo Research in Barcelona to secure a new PhD studentship project from the Northern Research Partnership. The PhD project title is: Hybrid User Profiling and Adaptation for Personalised Search in Social Media Environment.
Collaborator Contribution	The University of Aberdeen and Yahoo Research in Barcelona supported our NRP studentship project application. The student was co-supervised by Dr Jeff Pan of the University of Aberdeen.
Impact	Leszek Kaliciak completed his PhD in 2013.
Start Year	2009


Title	AutoAdapt Web-based demonstration system
Description	A Web-based application that provides search and adaptation services for intranets. It can be used as a standalone system or integrated as a Web service in other applications.
Type Of Technology	Webtool/Application
Year Produced	2011
Impact	It was used as the University of Essex intranet search engine.
URL	http://autoadaptproject.org
