Astronomical Image Handling - Application to TMA Analysis: Exploiting AstroGrid/CancerGrid Systems

Lead Research Organisation: University of Cambridge
Department Name: Institute of Astronomy

Abstract

Cancers result from the accumulation of genetic changes, with changes
in the genome of the cancer cells disrupting normal cellular
functions. Tracking the causes, and predicting the effects and impact
of treatments, is a complex problem involving the integration and
analysis of multi-dimensional datasets. Large clinical studies
(involving many thousands of patients) are now underway in the UK, aimed
at providing vital evidence to support the investigation of the causes
of a significant range of cancers. A key challenge is providing an
interface between, for instance, epidemiological trials data and high
volume data from tissue microarrays (used, for instance, to identify
markers for breast cancer classification).

The CancerGrid project is providing an e-infrastructure to support
e-clinical trials. AstroGrid has developed a framework to support the
discovery and analysis of large distributed astronomical images
using the latest developments in distributed computation. The
scientific data itself is generated by sophisticated image processing
pipelines acting on the raw observational data.

The aim of this programme is to explore novel uses of techniques
developed in the astronomical domain to provide solutions for the
challenges inherent in the analysis and integration of large
medical imagery. This will include the use of image analysis algorithms
optimised for feature classification, the integration of statistical
applications in a workflow environment, and the interfacing of the
resulting richly annotated results to the information infrastructure
created by CancerGrid.

Outcomes from this programme will be designs and prototypes of new
information systems to support the handling of these large distributed
data flows, and increased collaborative pathways between these key
medical and astronomical groups.

Technical Summary

CancerGrid is a key MRC eScience programme that is developing open
standards for clinical cancer informatics, building a
model-driven, document-centric architecture to manage and use the
complex data gathered in cancer clinical trials so that these data can
be used for linked translational tissue-based research. Tissue
microarrays (TMAs) from approximately 10,000 breast cancer samples from
four clinical studies are being constructed and will be linked to
their associated clinical data using a set of interoperable Common
Data Elements (CDEs) to represent concepts and measurements from the
trial. Analysis of this extensive image and clinical data is being
carried out to identify markers for breast cancer classification and
outcome. AstroGrid (http://www.astrogrid.org) has developed a Virtual
Observatory (VO) infrastructure, based on open standards, which
provides a data discovery and analysis environment for
astronomers. The Cambridge Astronomical Survey Unit (CASU)
(http://casu.ast.cam.ac.uk) has developed a sophisticated image
analysis pipeline system.

This programme will bring together key intellectual leaders of these
two major eScience projects located in Cambridge. It will enable
novel techniques to be investigated and transferred to provide
solutions to a number of key needs, supporting the integration of
digital imaging data with clinical trials information. These
techniques include the application of astronomical image analysis
algorithms optimised for feature recognition; the application of
statistical environments such as R, through an integrated workflow
system, to enable large scale statistical analysis of the resulting
annotated object catalogues; and the use of distributed e-science
infrastructure frameworks to facilitate multi-TB image data flows.

In the longer term, improved collaborative work between the larger
multi-institute groups associated with these projects will result,
thereby leveraging the value of capabilities developed in the PPARC
eScience programme to the benefit of the CancerGrid and related
projects. Techniques investigated during this programme will be further
developed to form the basis of significant new information handling
systems supporting research programmes such as determining markers for
breast cancer.
