Cambridge Astronomical Survey Unit (CASU): Filling the Astronomical Data Lake (2020-2024)

Lead Research Organisation: University of Cambridge
Department Name: Institute of Astronomy


Observational survey astronomy is powering discovery, with the UK leading and benefiting from its investment, both financial and intellectual, in state-of-the-art observational facilities. The last decade has seen the delivery of comprehensive imaging of both the northern and southern hemispheres in the near infrared using WFCAM and VISTA, with the UK leading these survey initiatives. These are providing key insights across a wide range of astrophysics. The next decade sees the arrival of large-scale spectroscopic surveys to probe key populations (be they galaxies, stars or asteroids) revealed by the imaging. The UK is well poised to lead discovery through its involvement in, and definition of, the specific surveys to be carried out on facilities such as WEAVE, 4MOST and MOONS.

The 2020s have the potential to be the decade of "total" survey astronomy, enabling profound insights into astrophysics at all scales through the combination of comprehensive imaging and spectroscopic data. Ensuring the scientific potential of these facilities and surveys requires the availability of expertise and systems to optimally extract information from the data. Here the UK has a substantial lead with the Cambridge Astronomical Survey Unit (CASU) and its proven ability to provide cost-effective data systems to the UK and wider communities.

CASU have been at the forefront of survey astronomy, both pioneering techniques to optimally extract knowledge from survey data and taking a proactive role in exploiting this information to produce world-leading research. This synergy and feedback between data processing and science delivery has been repeatedly demonstrated to be essential in ensuring delivery of the best possible science data products for exploitation by the widest community of UK and European astronomers. In the last decade, CASU-generated science data products from VISTA, WFCAM and VST imaging and Gaia-ESO VLT spectroscopy have supported world-class research programmes across almost every UK institute involved in astrophysics. CASU are filling the astronomical data lake, the vital data resource which the community is able to mine, combine with other multi-wavelength data (e.g. Euclid, PLATO) and use to discover rare and unique objects for further detailed study by facilities such as the ELT or the JWST.

The role of CASU has been acknowledged in the wider context, with ESO relying on CASU to provide the science data products from its wide range of public surveys currently running on the ESO survey facilities.

The CASU design philosophy is to allow the evolution of an optimal ergonomic solution to this avalanche of data, through access to petabyte-scale data storage systems and expert pipeline processing systems. Continued development, maintenance and operation of the CASU processing and analysis pipelines will ensure that the UK community is well positioned to exploit the data from these key new survey facilities rapidly for science. In addition, CASU expertise will be essential in meeting the challenges inherent in the application of machine-learning-assisted discovery in these data, and will provide a potential resource for the UK when developing the UK-LSST partnership.

This grant proposal builds on the tremendous advances already made by CASU and requests funding for the period 2020-2024 for the following activities:
- Design and development of the data management and analysis systems for the next generation of ESO / ING wide field massively multiplexed optical spectrographs WEAVE, MOONS and 4MOST;
- Further development, maintenance, operation and user support of the science and analysis pipelines for the UK-led VISTA large scale surveys;
- Operational support, pipeline processing and further development of the science and analysis pipelines for the UK-led ESO VST public surveys;
- Deployment of advanced data interfaces to the CASU science data, enabling machine learning assisted mining of the data.

Planned Impact

The University of Cambridge has one of the most successful programmes for encouraging knowledge transfer and resulting societal impact between University departments and industry, both in the United Kingdom and elsewhere. CASU's approach to the search for impact opportunities has been guided by the mechanisms the University has in place to facilitate this.

CASU continues to be involved in the transfer of image analysis and data handling systems to the medical domain, and in particular image processing applied to oncology. This exchange has significant potential both to increase the effectiveness of clinical health care and to enhance the quality of life of those with cancer, with better targeted therapeutic treatments leading to improved outcomes.

The partnership of CASU staff with the University of Cambridge's Department of Oncology and Cambridge Institute, Cancer Research UK, has continued, with involvement in the Cancer Research UK Grand Challenge initiative. The IMAXT (Imaging and Molecular Annotation of Xenografts and Tumors) project is a £20M Cancer Research UK Grand Challenge project, led from the Cambridge Institute. The IoA participates in this (Walton as co-I leading the IMAXT data analysis system development), with CASU expertise being leveraged to develop and deploy the image analysis and data handling system required to generate the segmented, registered image catalogues for the range of cutting-edge imaging technologies employed in the project. These include Serial Two Photon Tomography, Imaging Mass Cytometry and MERFISH data. In combination these allow annotated maps of cancer tumours in 3-D at the subcellular level, in which the gene and protein makeup of all cells is described. Linked to large-scale breast cancer trials, this is enabling a better understanding of disease and treatment pathways. The CASU expertise is in the transfer of image analysis techniques to the medical data, and in the associated methods for handling and combining the multi-modal data (for instance, the challenges in registering the data sets at the micron level). Within IMAXT, a significant processing infrastructure has been deployed, along with an associated science platform.

Locally, CASU is providing ad hoc data expertise to the STFC Cambridge Centre for Doctoral Training (CDT) PhD programme, and this will be extended to the other STFC PhD CDT networks, especially when access to and manipulation of the science data is simplified with the deployment of the CASU Science Data Access Hub as noted in Section 10.4.3.

As noted in the Pathways to Impact plan, CASU provides material to support the wider IoA Outreach programme, including the new IoA/KICC outreach programme aiming to increase STEM subject take-up in schools across Cambridgeshire, Norfolk, Suffolk and Peterborough. CASU is active in supporting the public understanding of science activities undertaken more generally at the IoA and the University of Cambridge. CASU provides a range of high-quality processed images of the sky which are used as high-impact visual material in outreach activities such as the successful series of one-day conferences for schools, each day in turn targeting KS2, KS3, KS4, KS5 and secondary school teachers.

Further details are contained within the Pathways to Impact document.


Abbott T (2021) The Dark Energy Survey Data Release 2 in The Astrophysical Journal Supplement Series
Aguado D (2020) The S2 Stream: the shreds of a primitive dwarf galaxy in Monthly Notices of the Royal Astronomical Society
Akhazhanov A (2022) Finding quadruply imaged quasars with machine learning - I. Methods in Monthly Notices of the Royal Astronomical Society
Andrade-Oliveira F (2021) Galaxy clustering in harmonic space from the dark energy survey year 1 data: compatibility with real-space results in Monthly Notices of the Royal Astronomical Society
Ansarinejad B (2022) The nature of sub-millimetre galaxies II: an ALMA comparison of SMG dust heating mechanisms in Monthly Notices of the Royal Astronomical Society
Barros S (2022) Detection of the tidal deformation of WASP-103b at 3 σ with CHEOPS in Astronomy & Astrophysics
Benz W (2020) The CHEOPS mission in Experimental Astronomy
Bernardinelli P (2022) A Search of the Full Six Years of the Dark Energy Survey for Outer Solar System Objects in The Astrophysical Journal Supplement Series
Bernardinelli P (2021) C/2014 UN271 (Bernardinelli-Bernstein): The Nearly Spherical Cow of Comets in The Astrophysical Journal Letters
Binks A (2021) The Gaia-ESO survey: a lithium depletion boundary age for NGC 2232 in Monthly Notices of the Royal Astronomical Society
Blomme R (2022) The Gaia-ESO Survey: The analysis of the hot-star spectra in Astronomy & Astrophysics
Borsato L (2021) Exploiting timing capabilities of the CHEOPS mission with warm-Jupiter planets in Monthly Notices of the Royal Astronomical Society
Bourrier V (2022) A CHEOPS-enhanced view of the HD 3167 system in Astronomy & Astrophysics
Bragaglia A (2022) The Gaia-ESO Survey: Target selection of open cluster stars in Astronomy & Astrophysics
Brandeker A (2022) CHEOPS geometric albedo of the hot Jupiter HD 209458 b in Astronomy & Astrophysics
Cantu S (2021) A Deeper Look at DES Dwarf Galaxy Candidates: Grus I and Indus II in The Astrophysical Journal


Description: WEAVE
Organisation: Isaac Newton Group of Telescopes (ING)
Country: Spain
Sector: Academic/University
PI Contribution: WEAVE is a new instrument under development and construction for the ING's William Herschel Telescope. We are responsible for the development of the Data Analysis system for the instrument, and also participate in the WEAVE Science Team.
Collaborator Contribution: The page at gives a list of the partners in the WEAVE consortium.
Impact: The consortium is responsible for the development of the WEAVE spectrograph. This will be commissioned on the ING's WHT late 2017.
Start Year: 2010

Title: CASUtools
Description: CASUtools
Type Of Technology: Software
Year Produced: 2019
Open Source License: Yes
Impact: CASUtools are elements of the image analysis pipeline software used by CASU.