Document Triage in the Information Seeking Process

Lead Research Organisation: City, University of London
Department Name: Centre for HCI Design

Abstract

When users are searching for information, they quickly read documents to assess their usefulness. This brief action, termed document triage, will often be undertaken many times in the pursuit of finding enough information to satisfy one particular need. The user may attempt different searches through a system such as Google, or browse a catalogue of documents, but in either case they will regularly need to open a digital document such as a web page or electronic book (PDF) to evaluate its real relevance. Experimental evidence demonstrates that the judgements made in these short reviews of documents play a significant role in the overall success of a user's information seeking. Users frequently err by rejecting relevant material and by selecting irrelevant material for further reading. Only substantial further effort by the user, in wasted close reading of unhelpful material and repeated attempts to locate relevant information, will finally achieve a sufficiently successful outcome for most searches. This is clearly inefficient. This project will scrutinise the impact that the interaction between the user and the document display software has on the user's decision making during this process. Driven by the goal to optimise the outcome of human effort, it will examine the positive and negative factors that affect the quality of relevance decisions during the initial reading of a document.

While users are reading a document primarily to assess its relevance, there are many other events that can be triggered at the same time. New information needs can be identified, answers to another question can be unexpectedly found, and alternative terminology for a new search discovered. These secondary activities can have dramatic impacts on the user's search plans, but there is limited understanding of the related needs of users. In physical environments, these goals are usually tracked using notebooks, scraps of paper and human memory. Digital information-seeking tools provide no replacement for these aids, and thus existing practice is usually continued alongside electronic searching.

Motivated by this gap, this research will refine our understanding of goal-tracking during information seeking. This improved knowledge will be used to design useful digital tools to support this work that integrate with existing information seeking facilities such as search engines and document organisation software.

There are a number of challenges to address in achieving these objectives. Relevant data, from which good models of human behaviour can be built, is in short supply. Though related areas, such as the detailed reading of longer texts, have been studied, these findings, and the data that underpins them, cannot be applied directly. Observations of human searchers in physical environments provide insights that may prove false when using computer software. The users of paper-based books are themselves limited by the properties of the books they use. New methods may be available digitally that paper cannot supply, and digital reproductions of paper-based behaviour may prove ineffective due to differences between the two media.

At present, software designers can only intuit the best interactions for document reader applications to use. These software tools were often originally designed to support the download and printing of documents, and already include features that support the systematic reading of longer documents. Scientists currently lack the theoretical insights to recommend effective solutions for supporting the triage reading of documents.

Document triage plays a central role in the decision-making processes that drive information seeking. A deeper understanding of document triage will make a significant contribution to the ongoing development of scientific understanding of the information seeking process as a whole.

Publications

Description We discovered that information triage - where users filter out information that is irrelevant during an information seeking process - is a complex activity comprising three main steps. We focussed on the second step, which was the least researched, and discovered the factors and document properties (e.g. title, layout) that are most important in influencing acceptance. We also created a model of the process, the first time such a model had been constructed.
Exploitation Route Our work points to the importance of the third stage of the process, where users reject a document after it has been read in full for the first time. We focussed on the second stage of filtering information, where users simply read a document quickly. We discovered that document layout and format are important, and that information is reliably read only if it appears in particular places.

While the third stage has been studied before, it now needs to be re-examined in light of our findings, and our model may help focus that work.

In addition, it is critical that information retrieval tools model the fact that readers' attention is selective, and existing "whole document" models are a poor match with how users will determine relevance at first sight.
Sectors Creative Economy; Digital/Communication/Information Technologies (including Software); Education; Financial Services, and Management Consultancy; Healthcare; Manufacturing, including Industrial Biotechnology; Culture, Heritage, Museums and Collections

 
Description The insights from this project are now being used at the University of Melbourne in further work on how users triage information during browsing tasks. This work is being done by Assoc. Prof. Shanton Chang, Dr. Wally Smith and Dana McKay. It has in turn been used to inform the design of the services of the State Library of Victoria and Swinburne University of Technology's library service.
First Year Of Impact 2017
Sector Education; Culture, Heritage, Museums and Collections
Impact Types Societal