Novel statistical methods for the design and analysis of proteomic experiments for biomarker discovery

Lead Research Organisation: University of Leeds
Department Name: Leeds Institute of Molecular Medicine

Abstract

There is an urgent need for new medical tests that can be used to diagnose a disease or to determine how severe it is. Such tests can lead to earlier and better treatment, increase the chance of recovery and help select the best treatment. Much medical research examines body fluids such as blood and urine to find proteins (one of the key types of molecules that make up the cells and tissues of the body) that could act as markers of disease and be measured in a simple test. Such research uses complex technologies and produces very large amounts of data. The purpose of this proposal is to develop the best methods of statistical analysis for this type of data. This is essential if we are to reach the right conclusions about whether a marker is a good or a bad one.

Technical Summary

Aims:
To develop novel statistical methods for the experimental design, monitoring and analysis of proteomic profiling experiments (PPE) in biomarker discovery research.

Specific objectives:
1) To develop approaches for statistical experimental design to perform PPE efficiently.
2) To extend sample size calculation techniques to experimental designs for the discovery of therapeutic and prognostic biomarkers.
3) To adapt quality control methodology for application to two-dimensional difference gel electrophoresis (2D DIGE) and to liquid chromatography and mass spectrometry (MS) techniques based on tagging (iTRAQ) and labelling (SILAC).
4) To construct flexible statistical models for the analysis of PPE which include elements of pre-processing and salient experimental and differential expression effects.
5) To compare and contrast Bayesian and classical inference techniques, as appropriate, together with the results they produce.

Methodology:
These objectives will be fulfilled by applying and developing modern statistical techniques of design and analysis to maximise the potential of PPE. This will involve the use of methods in statistical experimental design such as blocking and cycling. In conjunction with this, methods for sample size calculation will be constructed for PPE designs that can be used in the detection of therapeutic and diagnostic biomarkers, correctly allowing for issues of multiple testing and variance estimation. Each of these methods will be extended from MS-based PPE to 2D DIGE and to labelling techniques such as iTRAQ and SILAC. The similarities (and dissimilarities) between the technical and biological aspects of these PPE technologies and their experimental processes will be used carefully to construct statistical models and techniques from an underlying model. Methods of classical and Bayesian inference will be used to answer clinically relevant questions.
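
As a rough illustration of why multiple testing matters for sample size, the sketch below computes a per-group sample size for a two-group comparison of one protein feature, then repeats the calculation when many features are tested at once. It is not the method proposed in this project: it assumes a two-sided, two-sample comparison of means under a normal approximation with a known common standard deviation, and it uses a Bonferroni correction as a simple stand-in for multiple-testing control. The function name and all parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def per_group_sample_size(delta, sigma, n_features=1, alpha=0.05, power=0.8):
    """Approximate samples needed per group to detect a mean difference.

    Assumptions (illustrative only): two-sided two-sample z-test,
    common known standard deviation `sigma`, equal group sizes, and
    Bonferroni adjustment of `alpha` across `n_features` tests.
    """
    if delta <= 0 or sigma <= 0 or n_features < 1:
        raise ValueError("delta and sigma must be positive; n_features >= 1")

    # Bonferroni: split the overall significance level across all features.
    alpha_adjusted = alpha / n_features

    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha_adjusted / 2)
    z_beta = standard_normal.inv_cdf(power)

    # Standard normal-approximation formula for two equal-sized groups.
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)


# One feature versus a profile of 1000 features, same effect size.
single_feature = per_group_sample_size(delta=1.0, sigma=1.0)
many_features = per_group_sample_size(delta=1.0, sigma=1.0, n_features=1000)
print(single_feature, many_features)
```

Under these assumptions the required sample size grows as the number of tested features increases, which is why a design that ignores multiple testing tends to be underpowered. In practice the variance would have to be estimated rather than assumed known, which is the variance estimation issue mentioned above.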

Scientific and medical opportunities:
The effectiveness of costly biomedical research may be maximised by increasing the level of statistical input. This is particularly true at the discovery stage of an investigation. A false positive result can waste further resources on validation work that ultimately proves negative, whereas a false negative result could mean that an important biological finding is never identified. This is the case in biomarker discovery research, where well-designed, powerful experiments in conjunction with modern statistical analysis techniques can maximise the potential of such studies and hence deliver clinically useful biomarkers.
