Automatic Adaptation of Knowledge Structures for Assisted Information Seeking (AutoAdapt)
Lead Research Organisation:
Robert Gordon University
Department Name: School of Computing
Abstract
A massive number of electronic document collections exist within companies, universities and other institutions. Two common forms of information seeking are searching and exploring (browsing) the collections. However, finding relevant information within such collections can be difficult. This is true for searching with poorly formulated and underspecified queries as well as for browsing where the user may not have a specific target to search for. The user's information seeking could be assisted by well-structured knowledge about the search domain, which we refer to as a domain model. A domain model is effectively a structure that people impose on data to support them in information seeking. We can then derive query modification or browsing suggestions directly from the domain model. To illustrate the point using a realistic example, assume a user of the University of Essex intranet started by searching for "union". This query would trigger the search system to offer query refinement terms such as "students union" and "european union". Indeed, all local Web sites, intranets and similar collections contain a huge amount of valuable domain knowledge that is encoded implicitly. The challenge is to automatically acquire a domain model and then make it usable by assisting users in information seeking tasks such as searching or browsing. An even bigger challenge is to evolve this domain model automatically. The novelty of this proposal lies in evolving automatically acquired domain knowledge by observing users' usage of it and altering it accordingly. We hypothesize that the submitted user queries and the dialogues between users and the search system can be monitored and used to improve the domain model over time. A user's selection of a query modification suggestion is taken as an indication of relevance. This can then be used to update the domain knowledge and thus help the next user with a similar query by presenting updated query modification suggestions. This project aims to develop and evaluate methods for adapting automatically constructed domain models to the search and browsing behaviour of the user population. Application and large-scale evaluation of the developed methods in two information seeking scenarios - namely, interactive search and browsing - will be performed on a number of domains including the intranets of the University of Essex, the Open University and our industrial partners.
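The feedback loop described in the abstract (a selected suggestion counts as evidence of relevance and changes what the next user sees) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the class name DomainModel, the decay factor, the seed weights and the simple reinforce-and-decay update are hypothetical and are not the project's actual models.

```python
from collections import defaultdict


class DomainModel:
    """Toy domain model: weighted links from a query to refinement terms.

    Hypothetical illustration only; not the AutoAdapt implementation.
    """

    def __init__(self, decay=0.9):
        self.decay = decay  # assumed forgetting factor applied on each update
        self.weights = defaultdict(dict)  # query -> {refinement: weight}

    def add_link(self, query, refinement, weight=1.0):
        """Seed the model, e.g. with terms extracted from the documents."""
        self.weights[query][refinement] = weight

    def suggest(self, query, k=3):
        """Return up to k refinements, highest weight first (ties by name)."""
        links = self.weights.get(query, {})
        ranked = sorted(links.items(), key=lambda item: (-item[1], item[0]))
        return [term for term, _ in ranked[:k]]

    def record_selection(self, query, refinement, reward=1.0):
        """Treat a user's selection as implicit relevance feedback."""
        links = self.weights[query]
        for term in links:
            links[term] *= self.decay  # older evidence fades slightly
        links[refinement] = links.get(refinement, 0.0) + reward


model = DomainModel()
model.add_link("union", "students union", weight=1.5)  # assumed seed weights
model.add_link("union", "european union", weight=1.0)

before = model.suggest("union")
model.record_selection("union", "european union")
after = model.suggest("union")
print(before)
print(after)
```

In this sketch a single selection of "european union" is enough to move it ahead of "students union" for the next user who searches for "union"; a real system would need to weigh such signals far more cautiously.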
People | ORCID iD |
Dawei Song (Principal Investigator) | |
Anne De Roeck (Co-Investigator) | |
Publications
Zhu J (2009) Integrating multiple document features in language models for expert finding, in Knowledge and Information Systems
Zhu J (2009) Integrating multiple windows and document features for expert finding, in Journal of the American Society for Information Science and Technology
Cerviño Beresi U (2010) Research and Advanced Technology for Digital Libraries
Cerviño Beresi U (2010) Advances in Information Retrieval
Beresi U (2010) Why did you pick that? Visualising relevance criteria in exploratory search, in International Journal on Digital Libraries
Kaliciak L (2010) Novel local features with hybrid sampling technique for image retrieval
Yan X (2011) Toward a semantic granularity model for domain-specific information retrieval, in ACM Transactions on Information Systems
Albakour M (2011) Information Retrieval Technology
Description | A massive number of electronic document collections exist within companies, universities and other institutions. To access such collections users typically search or explore (browse) them. However, finding relevant information within such collections can be difficult. This is true for searching with poorly formulated and underspecified queries as well as for browsing where the user may not have a specific target to search for. The user's information seeking could be assisted by well-structured knowledge about the search domain, which we refer to as a domain model. A domain model is effectively a structure that people impose on data to support them in information seeking. We can then derive query modification or browsing suggestions directly from the domain model. To illustrate the point using a realistic example, assume a user of the University of Essex intranet started by searching for "union". This query could trigger the search system to offer query refinement terms such as "students union", "trade union" and "european union". Web search engines have recently started to move towards increasingly interactive search assistance, but such features are still not very common on Web sites, in digital libraries and in enterprise search applications. The AutoAdapt project has made significant progress in addressing this issue. The type of document collections we focussed on in the AutoAdapt project, i.e. local Web sites, intranets, digital libraries and similar collections, typically do not come with explicitly encoded knowledge that can be used for interactive search support. However, all of these collections contain a huge amount of valuable domain knowledge that is encoded implicitly. For example, a user's selection of a query modification suggestion can be taken as an indication of relevance. This can then be used to update the (previously acquired) domain knowledge and thus help the next user with a similar query by presenting updated query modification suggestions. The first challenge is to automatically acquire a usable domain model. An even bigger challenge is to evolve this domain model automatically. In the AutoAdapt project we systematically studied a variety of adaptive learning approaches applied to domain models by exploiting the actual documents, query logs and different forms of implicit relevance feedback. Through this research we have demonstrated that methods such as ant colony optimisation, click graphs and formal concept analysis can be applied to learn domain models that improve over time. We also developed a novel methodology that combines the document collection at hand with query logs, which leads to a model that is significantly better than either approach applied individually. Apart from developing methods for domain model adaptation, we systematically evaluated each such approach using a variety of commonly used evaluation techniques and by participating in relevant competitions such as TREC and CLEF. We also developed a novel evaluation framework (AutoEval) that will help researchers in evaluating interactive search applications, an area that is much less developed than standard ad hoc search. |
Exploitation Route | We expect some further significant impact of the project. So far there are two main outcomes. First, our research has been published in the main research outlets of the areas we were working in (involving a number of papers with international collaborators and PhD students). Second, the University of Essex Web site search is now based on the search framework developed within the project, which incorporates an adaptive domain model learning component. In addition, we will release the implementations of a selection of adaptive models as an open source project. We are also currently running a live trial of a version of our prototype in collaboration with BT. |
Sectors | Digital/Communication/Information Technologies (including Software) |
Description | We expect some further significant impact of the project. So far there are two main outcomes. First, our research has been published in the main research outlets of the areas we were working in (involving a number of papers with international collaborators and PhD students). Second, the University of Essex Web site search is now based on the search framework developed within the project, which incorporates an adaptive domain model learning component. In addition, we will release the implementations of a selection of adaptive models as an open source project. We are also currently running a live trial of a version of our prototype in collaboration with BT. |
First Year Of Impact | 2013 |
Sector | Digital/Communication/Information Technologies (including Software) |
Impact Types | Economic |
Description | Marie Curie Innovative Training Networks |
Amount | € 3,460,000 (EUR) |
Funding ID | 721321 |
Organisation | European Commission |
Sector | Public |
Country | European Union (EU) |
Start | 01/2017 |
End | 12/2020 |
Description | NRP studentship project: Hybrid User Profiling and Adaptation for Personalised Search in Social Media Environment |
Amount | £25,908 (GBP) |
Organisation | Northern Research Partnership |
Sector | Academic/University |
Country | United Kingdom |
Start | 09/2009 |
End | 07/2012 |
Title | A hybrid approach for construction and adaptation of domain knowledge structures |
Description | A hybrid approach for automatic construction and adaptation of domain knowledge structures from text documents and user query logs. |
Type Of Material | Computer model/algorithm |
Provided To Others? | No |
Impact | The approach has been applied to Essex University intranet query log data for recommendation of query term suggestions. |
Description | NRP PhD studentship project |
Organisation | Aston University |
Department | Computer Science |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | In the context of our EPSRC AutoAdapt and Renaissance projects, we collaborated with the University of Aberdeen and Yahoo Research in Barcelona in successfully securing a new PhD studentship project from the Northern Research Partnership. The PhD project title is: Hybrid User Profiling and Adaptation for Personalised Search in Social Media Environment. |
Collaborator Contribution | The University of Aberdeen and Yahoo Research in Barcelona supported our NRP studentship project application. The student was co-supervised by Dr Jeff Pan of the University of Aberdeen. |
Impact | Leszek Kaliciak completed his PhD in 2013. |
Start Year | 2009 |
Description | NRP PhD studentship project |
Organisation | Yahoo! |
Department | Yahoo! Research |
Country | United States |
Sector | Private |
PI Contribution | In the context of our EPSRC AutoAdapt and Renaissance projects, we collaborated with the University of Aberdeen and Yahoo Research in Barcelona in successfully securing a new PhD studentship project from the Northern Research Partnership. The PhD project title is: Hybrid User Profiling and Adaptation for Personalised Search in Social Media Environment. |
Collaborator Contribution | The University of Aberdeen and Yahoo Research in Barcelona supported our NRP studentship project application. The student was co-supervised by Dr Jeff Pan of the University of Aberdeen. |
Impact | Leszek Kaliciak completed his PhD in 2013. |
Start Year | 2009 |
Title | AutoAdapt Web-based demonstration system |
Description | A Web-based application that provides search and adaptation services for intranets. It can be used as a standalone system or integrated as a Web service in other applications. |
Type Of Technology | Webtool/Application |
Year Produced | 2011 |
Impact | It was used as the University of Essex intranet search engine. |
URL | http://autoadaptproject.org |