Automating Tissue Microarray Analysis: extending PathGrid
Lead Research Organisation:
University of Cambridge
Department Name: Institute of Astronomy
Abstract
In 2003, over 44,000 people in the UK were diagnosed with breast cancer, which is now the most common cancer in the UK. The lifetime risk of developing breast cancer is 1 in 9, and while most of the women who get breast cancer are past their menopause, almost 8,000 of those diagnosed each year are under 50 years old. Improving outcomes is a key challenge in treatment. High-throughput genomic methods, such as expression profiling of frozen tissue samples using microarrays, have resulted in the discovery of many novel gene signatures that are correlated with clinical outcomes in cancer treatment. It is essential that these potential biomarkers are validated on large numbers of independent samples prior to clinical use. Tissue microarrays (TMAs) created from paraffin-embedded tumour samples from large clinical trials are the ideal reagent for these validation experiments. However, the analysis and scoring of antibody-based markers on TMAs that may contain thousands of patient samples presents major challenges for pathology reporting and image handling. Our initial pilot PathGrid study (Oct 2007 to Oct 2008) is using a range of techniques developed in astronomy both to analyse imaging data and to handle and manipulate the resulting data products. These have been applied to the challenges involved in analysing the TMA image data taken from the SEARCH population-based study of breast cancer. We have utilised astronomy 'Virtual Observatory' components, specifically adapted from those developed within the AstroGrid Virtual Observatory eScience programme (http://www.astrogrid.org), to facilitate secure data transport, resource discovery through appropriate metadata, data acquisition, ingestion into a database system, and secure distributed access to those data and information resources. Image analysis has been applied to the input TMA data utilising a range of algorithms originally developed for diverse astronomical use cases.
The resulting data products will in turn be interfaced (through work planned here) to the clinical trials systems developed through the CancerGrid system (see http://www.cancergrid.org). In our initial test case we have automated the scoring of Estrogen Receptor (ER) assessments. ER is an important regulator of mammary growth, but is also a key prognostic and therapeutic target in breast cancer. Assessing ER status at the time of breast cancer diagnosis determines which treatment programmes patients should follow. In particular, those patients who have ER-positive breast cancer will be offered estrogen antagonist therapies such as tamoxifen. At CR-UK, breast cancer studies utilising genomics tools are underway to validate existing and new prognostic and/or predictive markers. TMAs have been created from a large population-based clinical trial (SEARCH; part of the Anglia Breast Cancer study) for analysis with a range of candidate markers. Immunohistochemistry is used to assess the level of nuclear ER expression. The focus of our initial pilot has been on algorithm development and validation. For the miniPIPSS programme, the work will move to increasing the utility of the analysis system by running all processing operations in a pipeline. This automation, the operational basis of our PathGrid system, has been implemented in prototype by making use of the application-grid infrastructural components from AstroGrid. This miniPIPSS project will facilitate the further interchange of ideas and technologies between the physical and medical sciences, and provide methods for the handling and processing of clinical image data in an open and extensible manner. The interaction with Oracle represents the initial stage of a longer-term partnership, aiming to further develop these analysis techniques so that they are available to the wider medical research community and, further ahead, for use in a clinical environment.