A Computational Information Geometry Approach to Sensitivity Analysis in Statistical Science

Lead Research Organisation: Open University
Department Name: Mathematics & Statistics

Abstract

Statistical Science. Human inquisitiveness is a fundamental driving force for progress. 'What's the pattern?', 'How can I understand, predict, control events?', 'How can I make things better?', ... questions like these drive forward scientific enquiry. Statistical science is the name given to that part of scientific enquiry where the data contain a random (non-deterministic) component.

Sensitivity Analysis. This project concerns sensitivity analysis in statistical science: the study of how scientifically relevant changes in the way we formulate problems affect answers to our questions of interest. There are two, interacting, aspects to such changes, associated with models and data respectively:

1. In statistical science, the complex mechanism generating the data is described by some simpler, but still realistic, model highlighting specific aspects of interest. Such models are not 'true' in any absolute sense. Rather, all models are wrong but some are useful. Viewing a model as a collection of assumptions approximating the real world, it makes sense to explore a range of scientifically reasonable models around the currently accepted one.
2. Exploration of the data can reveal unanticipated features. These may reflect previously unknown structure, requiring us to elaborate our models. Or, again, subsets of observations having disproportionately large influence on our results, requiring reference back to the scientist to determine if they are highly informative cases meriting further study or, at the other extreme, the result of measurement or recording errors requiring down-weighting or deletion.

Since such changes in problem formulation are pertinent, sensitivity analyses are sensible. Overall scientific motivation is, thus, threefold:
- Stability: If small changes have small effects, you gain the reassurance of stability.
- Robustness: If large changes have small effects, you know your analysis is robust.
- Warning: If small changes have large effects, you want to know about it!

Information Geometry. This project takes a powerful new look at sensitivity analysis in statistical science based on information geometry. This is a modern approach to statistics emphasising the importance of reaching invariant conclusions, i.e. those which don't depend on scientifically irrelevant choices, such as measuring lengths in inches or centimetres.

Unification, Extension and Exploration. Currently, there are several approaches to sensitivity analysis and associated geometric structures but, still, important classes of scientific problems not amenable to treatment by any of them. This project will develop and exploit the latest, third-generation, methods of information geometry so as to:
- unify these seemingly divergent approaches within an overall framework,
- extend them to cover a wider range of practically important problems, and
- explore ways of profitably combining different approaches.

Delivery. Delivery to the general statistical practitioner will be via a coherent set of software tools, designed so that the geometry is essentially hidden from the user. In this way, the benefits of the excellent properties enjoyed by our new methodology will be accessible without needing to master the underlying geometry.

Focus. Generalised linear models have been chosen as the focus of this work because of their excellent theoretical properties and widespread practical use. The workhorse of applied statistics for the last two decades, they are flexible enough to cover a very wide range of commonly encountered situations and have shown their value from archaeology to zoology via bioinformatics, engineering and finance.

Beneficiaries. The proposed project will benefit both researchers and practitioners in three active communities: sensitivity analysis, information geometry and generalised linear models.

Description: This grant established the field of Computational Information Geometry (CIG), in which differential and other forms of geometry are applied to pressing practical problems in statistics.

In particular, it produced an operational universal space of 'all possible models', a novel solution to a longstanding problem.
Exploitation Route: Subsequent, ongoing publications detail how this work can be taken forward in a variety of directions and domains, from further fundamental theory to hands-on practical application.
Sectors: Aerospace, Defence and Marine; Agriculture, Food and Drink; Financial Services, and Management Consultancy; Other