Operationalizing the Logical Uncertainty Principle in a Language Modelling Framework for Context-based Information Retrieval

Lead Research Organisation: Open University
Department Name: Knowledge Media Institute


The pressing need to deal with information overload has driven the recent emergence of context-based information retrieval (IR). IR research is shifting towards context-centric approaches that allow a system to draw inferences from the user's retrieval context. I strongly believe that context-driven reasoning is the key to building more user- and context-sensitive IR systems. In applications related to IR (for example, retrieving documents relevant to a given query, expanding the original query with related terms, determining the correct answer in Question-Answering, or determining an appropriate translation in cross-language retrieval), some form of reasoning is often embedded as a result of information transformation and context dependency. As an illustration, given the query "Java", an IR system may return documents about programming and documents about Merapi (a volcano in the central part of the island of Java), as they all contain the term "java". If the retrieval context is "computer", documents about programming are relevant. For a volcanologist, however, documents about Merapi are more likely to be relevant. This uncertainty arises from the flow of information from "Java" and "computer" to "programming", or from "Java" and "volcano" to "Merapi", depending on the retrieval context. Current IR technology, including search engines such as Google, does not cater for scenarios like this one.
This research is motivated by the following fundamental question: can we make the reasoning process explicit in an IR system, so that the system gains the capability to reason about which information items are truly relevant given the user's retrieval context? The Logical Uncertainty Principle views IR as a process of plausible logical inference and thus provides a potentially significant theoretical foundation for context-based IR.
Nevertheless, its operationalization has long been a problem, owing to the difficulty of obtaining the contextual and domain knowledge and of implementing symbolic logical models on a large scale. Recent advances in language technologies open the door to realizing the Logical Uncertainty Principle in a practical setting. In particular, language modelling frameworks have been developed for IR that integrate different types of term relationships via a smoothing mechanism. The language modelling approach provides a solid theoretical setting, produces promising experimental results (comparable to the best IR systems), and is computationally efficient. Building on my existing work in this direction, I will investigate the operationalization of the Logical Uncertainty Principle in a language modelling framework to facilitate effective context-dependent reasoning. I will conduct theoretical research, prototype system development, and experimental evaluation with large-scale datasets. It is my belief that combining the strengths of logical inference and language modelling in a new-generation IR infrastructure can lead to IR systems that are more intelligent and context-sensitive while remaining computationally tractable.
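
To make the role of smoothing concrete, the following is a minimal sketch of query-likelihood ranking with Jelinek-Mercer smoothing, applied to the "Java" example above. It is an illustration under stated assumptions, not the framework proposed in this project: the two toy documents, the function names jm_log_likelihood and rank, and the weight lam=0.5 are all hypothetical, and the retrieval context is simply appended to the query as an extra term rather than handled by logical inference.

```python
import math
from collections import Counter


def jm_log_likelihood(query_terms, doc_tokens, collection_counts, collection_len, lam=0.5):
    """Log P(query | document) with Jelinek-Mercer smoothing.

    Each term probability interpolates the document model with the
    collection model: (1 - lam) * P(t | doc) + lam * P(t | collection).
    """
    doc_counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    score = 0.0
    for term in query_terms:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_coll = collection_counts[term] / collection_len
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # Term unseen in the whole collection: the query cannot be generated.
            return float("-inf")
        score += math.log(p)
    return score


# Hypothetical toy collection mirroring the "Java" example (assumption).
docs = {
    "programming": "java is a programming language used on every computer".split(),
    "merapi": "merapi is an active volcano in central java".split(),
}
collection_counts = Counter(tok for toks in docs.values() for tok in toks)
collection_len = sum(collection_counts.values())


def rank(query_terms):
    """Return document names ordered from most to least likely to generate the query."""
    return sorted(
        docs,
        key=lambda name: jm_log_likelihood(
            query_terms, docs[name], collection_counts, collection_len
        ),
        reverse=True,
    )


if __name__ == "__main__":
    # Context is modelled crudely as an additional query term (assumption).
    print(rank(["java", "computer"]))
    print(rank(["java", "volcano"]))
```

In this sketch the same query term "java" leads to a different top-ranked document depending on the context term supplied with it. Inferring "programming" from "Java" and "computer" when the document does not contain the context term literally is the harder problem, and it is the part the proposed work would address.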


Yuexian Hou (2010) Beyond Redundancies: A Metric-Invariant Method for Unsupervised Feature Selection in IEEE Transactions on Knowledge and Data Engineering