Benchmarking for AI for Science at Exascale (BASE)
Lead Research Organisation:
Science and Technology Facilities Council
Department Name: Scientific Computing Department
Abstract
Advances in Artificial Intelligence (AI) and Machine Learning (ML) have enabled the scientific community to advance the frontiers of knowledge by learning from complex, large-scale experimental datasets. With the scientific community generating huge amounts of data from observatories to large-scale experimental facilities, AI for Science at Exascale is on the horizon.
However, in the absence of systematic approaches to evaluating AI models and algorithms at exascale, the AI for Science community, and indeed the wider AI community, face a major barrier.
This proposal aims to set up a working group whose overarching goal is to identify the scope and plans for developing AI benchmarks that enable AI for Science at Exascale in ExCALIBUR Phase II.
Although AI benchmarking is an increasingly well-explored topic, a number of issues remain to be addressed, including, but not limited to:
a) There are no efforts aimed at AI benchmarking at exascale, particularly for science;
b) A range of scientific problems involving real-world large-scale scientific datasets, such as those from experimental facilities or observatories, are largely ignored in benchmarking; and
c) Benchmarks could usefully serve as a catalogue of techniques, offering template solutions to different types of scientific problems.
In this proposal, when scoping the development of an AI benchmark suite, we will aim to address these issues. In developing a vision, a scope and a plan for this significant challenge, the working group will engage not only with scientists from a number of disciplines and with industry, but will also engineer a scalable and functional AI benchmark, so as to learn and embed the practical aspects of developing an AI benchmark into the vision, scope, and plan. The exemplary benchmark will focus on removing noise from images, a common issue across multiple disciplines, including life sciences, material sciences and astronomy. The specific problems from these disciplines are, respectively, removing noise from cryogenic electron microscopy (cryo-EM) datasets, denoising X-ray tomographic images, and minimising noise in weak-lensing images.
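Denoising benchmarks of the kind described above are typically scored with peak signal-to-noise ratio (PSNR) against a clean reference image. The following is an illustrative sketch of that metric (not code from the proposal or from sciml-bench):

```python
# Illustrative sketch: scoring a denoising benchmark entry with peak
# signal-to-noise ratio (PSNR), a standard metric for image-denoising
# tasks such as cryo-EM or X-ray tomography denoising.
import numpy as np

def psnr(clean: np.ndarray, denoised: np.ndarray, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; higher means better denoising."""
    mse = np.mean((clean.astype(np.float64) - denoised.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

# Toy example: a synthetically noised image scored against its clean reference.
rng = np.random.default_rng(0)
clean = rng.random((64, 64))
noisy = np.clip(clean + rng.normal(0.0, 0.1, clean.shape), 0.0, 1.0)
print(f"PSNR of noisy input: {psnr(clean, noisy):.1f} dB")
```

A denoising model would then be judged by how far it raises the PSNR of its output above that of the noisy input.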
Planned Impact
The core of our delivery is a scope and plan for the development of a suite of AI benchmarks in Phase II of the ExCALIBUR programme. When delivered, we see potential impacts across a number of different communities. We discuss these below in turn.
1. Scientists
There are two ways in which the outcomes of this proposal would impact the scientific community: first, by enabling easier adoption of AI/ML methods; and second, by enabling the development of better, scalable AI algorithms for a number of scientific problems.
With a catalogue of AI/ML methods available as template techniques for a range of representative problems from different disciplines, we expect that domain scientists can make use of these benchmarks to build better models or to solve similar problems. The benchmarks are aimed at exascale cases and, as such, are likely to include models that can cope with very large datasets and scale well with computational resources, so as to minimise training time. This would speed up the adoption of AI and machine learning across scientific domains.
2. Hardware Manufacturers
Benchmarks offer insights into the performance of AI software, and observed bottlenecks can be attributed to either software or hardware issues. Hardware issues, in particular, can be identified by profiling across multiple architectures. This will enable hardware manufacturers to develop improved hardware systems. It will also enable hardware-software co-design, particularly within the AI space with exascale in mind.
3. Software / Service Providers
With benchmarks available for certain types of scientific problems, software companies and service providers, such as those supplying scientific software, will be able to engineer better software than the reference implementations in the benchmarks. Furthermore, these benchmarks can also drive improved software for exascale systems, such as software profilers.
4. Large-scale System Users
Communities who procure large-scale systems, such as national laboratories or data centres, can use these benchmarks to assess the relative and absolute performance of different systems. The suite can also inform the procurement of hardware resources for a specific user community, using the relevant benchmarks within it.
5. The AI Community
The benchmarks that we are aiming to develop are based on real cases (as opposed to synthetic ones), and can be used across different aspects of AI research, such as explainability and transfer learning.
Organisations
- Science and Technology Facilities Council (Lead Research Organisation)
- Oak Ridge National Laboratory (Collaboration)
- IBM (Collaboration)
- Cerebras Systems Inc. (Collaboration)
- Graphcore (Collaboration)
- The Mathworks Ltd (Collaboration)
- Argonne National Laboratory (Collaboration)
- Intel (United States) (Collaboration)
- European Spallation Source (Collaboration)
- University of California, Berkeley (Collaboration)
- DataDirect Networks (Collaboration)
- Science and Technology Facilities Council (STFC) (Collaboration)
- NVIDIA (Collaboration)
- Boston Ltd (Collaboration, Project Partner)
- Shanghai Synchrotron Radiation Facility (SSRF) (Collaboration)
- University of Washington (Collaboration)
- NVIDIA Limited (UK) (Project Partner)
- MathWorks (United Kingdom) (Project Partner)
- University of Leicester (Project Partner)
- DDN (DataDirect Network) (International) (Project Partner)
- IBM Hursley (Project Partner)
- Cerebras Systems (Project Partner)
Publications
Choudhary A
(2023)
Artificial Intelligence for Science - A Deep Learning Revolution
Choudhary K
(2023)
Unified graph neural network force-field for the periodic table: solid state applications
in Digital Discovery
Henghes B
(2022)
Deep learning methods for obtaining photometric redshift estimations from images
in Monthly Notices of the Royal Astronomical Society
Henghes B
(2021)
Benchmarking and scalability of machine-learning methods for photometric redshift estimation
in Monthly Notices of the Royal Astronomical Society
Lucie-Smith L
(2020)
Deep learning insights into cosmological structure formation
Lucie-Smith L
(2022)
Discovering the building blocks of dark matter halo density profiles with neural networks
in Physical Review D
Description | Since the programme was funded, we have established the significance of science-specific AI benchmarks not only across the science community, but also across the industry and machine-learning communities. Unlike industrial challenges, science-specific machine-learning solutions stress a number of cross-cutting aspects, including systems, AI models and algorithms, and the possibility of advancing science through AI. The ongoing efforts with the MLCommons Science Working Group, and plans to submit the outcomes of this programme as potential MLCommons Science benchmarks to a wider target group, are a significant achievement. |
Exploitation Route | The final benchmark suite, delivered as part of this programme, is likely to aid both industry and the scientific community in multiple ways: 1. The suite will provide a means for comparing systems. 2. The suite will enable advancing science through AI/ML. 3. It will serve as a valuable teaching tool. 4. The RSE community can benefit from the suite by incorporating the benchmarks into testing/verification and continuous-integration pipelines. 5. The benchmarks matter to nearly all scientific domains, from material sciences to astronomy to life sciences, as they provide key algorithms for representative problems from these domains. |
Sectors | Chemicals,Digital/Communication/Information Technologies (including Software),Education,Energy,Environment,Healthcare,Manufacturing, including Industrial Biotechnology,Pharmaceuticals and Medical Biotechnology |
URL | https://github.com/stfc-sciml/sciml-bench |
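Point 4 of the exploitation route above (benchmarks in a CI pipeline) can be sketched as a simple regression gate: a code change that degrades model quality fails the build. The names and threshold values below are hypothetical, not part of the sciml-bench API:

```python
# Hypothetical sketch of a benchmark-as-CI-gate: compare a candidate run's
# benchmark score against a recorded baseline and fail the build on a
# regression larger than a small tolerance.

BASELINE_PSNR_DB = 28.0   # score of the published baseline model (assumed value)
TOLERANCE_DB = 0.5        # allowed regression before the gate fails

def ci_gate(candidate_psnr_db: float) -> bool:
    """Return True if the candidate score is within tolerance of the baseline."""
    return candidate_psnr_db >= BASELINE_PSNR_DB - TOLERANCE_DB

assert ci_gate(28.3)       # small improvement: build passes
assert not ci_gate(26.0)   # clear regression: build fails
```

In practice the score would come from running one of the suite's benchmarks as part of the pipeline, with the baseline stored alongside the repository.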
Description | The notion of AI benchmarking has become more accepted than it was a few years ago, at the start of this programme. Through community-linked activities, industries are starting to understand the significance of this approach, and benchmarking is becoming a mechanism for procurement, with various parties using benchmarks to establish the suitability of systems for various AI tasks. In particular, science-based AI benchmarks are often sought by industry. |
First Year Of Impact | 2022 |
Impact Types | Economic |
Description | Blueprinting AI for Science at Exascale - Phase II (BASE-II) |
Amount | £750,713 (GBP) |
Funding ID | EP/X019918/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 12/2022 |
End | 11/2024 |
Title | AI Benchmarking Framework and Benchmarks |
Description | This research tool helps in assessing various machine learning techniques for a given scientific problem in a coherent manner. The tool is supplied with a suite of exemplary, real-world scientific problems, with corresponding baseline solutions (implemented as ML-based solutions). This enables novel methods to be benchmarked against the baselines, or similar benchmarks to be established. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2022 |
Provided To Others? | Yes |
Impact | This tool serves as a fundamental vehicle for developing novel AI techniques for a range of problems faced by the large-scale experimental facilities within the Rutherford Appleton Laboratory. It has also paved the way for establishing very meaningful collaborations with US laboratories and with industry. |
URL | https://github.com/stfc-sciml/sciml-bench |
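The entry above describes pairing each scientific problem with a baseline solution so that novel methods can be scored against it. A minimal sketch of that idea follows (hypothetical names and a toy dataset, not the actual sciml-bench internals):

```python
# Minimal sketch of a benchmark registry: each named benchmark pairs a
# dataset loader with a baseline solver and a metric, so any candidate
# method can be evaluated against the baseline on the same data.
from typing import Callable, Dict, Tuple

# benchmark name -> (dataset loader, baseline solver, metric)
REGISTRY: Dict[str, Tuple[Callable, Callable, Callable]] = {}

def register(name: str, loader: Callable, baseline: Callable, metric: Callable) -> None:
    REGISTRY[name] = (loader, baseline, metric)

def evaluate(name: str, candidate: Callable) -> Tuple[float, float]:
    """Return (baseline score, candidate score) on the same data."""
    loader, baseline, metric = REGISTRY[name]
    x, y = loader()  # (input, ground truth)
    return metric(y, baseline(x)), metric(y, candidate(x))

# Toy registration: a "denoising" task with an identity baseline and an
# MSE metric (lower is better).
register(
    "em_denoise_toy",
    loader=lambda: ([1.0, 2.0, 3.5], [1.0, 2.0, 3.0]),   # (noisy, clean)
    baseline=lambda x: x,                                 # do nothing
    metric=lambda y, p: sum((a - b) ** 2 for a, b in zip(y, p)) / len(y),
)
base, cand = evaluate("em_denoise_toy", candidate=lambda x: [float(round(v)) for v in x])
```

The real framework adds dataset hosting, logging and containerised execution around this core comparison.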
Description | Collaboration with Argonne National Laboratories (AI Benchmarking) |
Organisation | Argonne National Laboratory |
Country | United States |
Sector | Public |
PI Contribution | The overall collaboration is for advancing the AI benchmarking agenda along with the scaling and exascale agenda. As such, for this collaboration, the Scientific Machine Learning Research Group contributed by * Bringing knowledge and expertise around AI benchmarking * Working with ANL to identify a scientific case for a benchmark, and * Identifying datasets to be used for benchmarking purposes. |
Collaborator Contribution | Their contributions included discussion of various scientific cases, access to systems, and methodologies for engineering solutions. |
Impact | New upcoming benchmarks, additional datasets, and new modified framework for benchmarking. |
Start Year | 2021 |
Description | Collaboration with Argonne National Laboratory |
Organisation | Argonne National Laboratory |
Country | United States |
Sector | Public |
PI Contribution | The BASE programme aims to develop an AI benchmarking suite for exascale systems, covering multiple domains of science. In this, AI for Science is a core area of focus, and for this collaboration we brought a prototype benchmark suite and potential benchmark cases that would be of interest to Argonne's AI for Science programme. |
Collaborator Contribution | They have contributed by providing expert advice to our programme, particularly by proposing a potential benchmark member (CANDLE). They also took part in discussions and workshops organised as part of this programme. |
Impact | Additional collaborations (such as the Instrument-to-Edge programme at RAL). Multi-disciplinary: AI for Science, and machine learning. |
Start Year | 2020 |
Description | Collaboration with Boston Systems |
Organisation | Boston Ltd |
Country | United Kingdom |
Sector | Private |
PI Contribution | One of the key agendas of this grant is to develop science-specific AI benchmarks for new systems. As such, we have developed a benchmark suite that would benefit both the open science community and industry. In this instance, the benchmarks are useful to Boston Limited for evaluating and comparing various integrated systems. |
Collaborator Contribution | Boston Limited has contributed to this research by * Consulting with our development team on the typical needs and requirements they seek from benchmarks * Enabling us to run benchmarks on newly integrated systems * Advising us on how to fine-tune benchmarks * Discussing benchmark results * Taking part in our workshops * Providing technical steer wherever possible * Providing some system-specific training for our development team, and * Providing support on systems made available for evaluation. |
Impact | * Preliminary benchmark results. |
Start Year | 2020 |
Description | Collaboration with Cerebras |
Organisation | Cerebras Systems Inc. |
Country | United States |
Sector | Private |
PI Contribution | One of the key agendas within the Scientific Machine Learning group is AI benchmarking. Our key contributions to this ongoing partnership are: * A framework for performing AI benchmarking * Datasets for AI benchmarking * Domain-specific problems for AI benchmarking * Performance evaluation of benchmarks on an AI-specific architecture from Cerebras (CS-1) |
Collaborator Contribution | In developing the overall benchmarking programme, Cerebras has provided a number of contributions: * Staff time for discussing relevant benchmarks * Access to compute time on their state-of-the-art CS-1 system * Staff time for discussing architectural aspects of their systems. |
Impact | This collaboration is multi-disciplinary, involving machine learning, material sciences, particle physics, environmental sciences and life sciences. This is an ongoing partnership / collaboration, and has resulted in a strong support towards an EPSRC grant application (ExCALIBUR phase I), and further developments on benchmarks. |
Start Year | 2020 |
Description | Collaboration with DDN Research Group |
Organisation | DataDirect Networks |
Country | United States |
Sector | Private |
PI Contribution | The Benchmarking for AI for Science at Exascale (BASE) programme includes a particular theme called end-to-end benchmarking, where the idea is to test various storage testbeds on different AI workloads. Along this line, we have developed a benchmark suite that would benefit both the open science community and industry. In this instance, some of the benchmarks are IO-intensive and are useful to the DDN Research Group, one of the leading storage-systems providers, for evaluating and comparing various storage systems. |
Collaborator Contribution | * Consulting with our development team on the ideal benchmarks for testing storage/IO systems * Advising us on benchmark metrics for storage systems * Taking part in our workshops * Providing technical steer wherever possible, particularly towards systems we can test * Providing a storage system for trial purposes (which did not take place owing to Covid-19). |
Impact | The evaluations and corresponding outputs have been affected by Covid-19; please see the Covid-19 impact section. |
Start Year | 2020 |
Description | Collaboration with DiRAC |
Organisation | Science and Technology Facilities Council (STFC) |
Department | Distributed Research Utilising Advanced Computing |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The BASE programme aims to develop AI benchmarks for science cases. In this collaboration, we made the following contributions: * A training event on ML for Science for the DiRAC community * Discussions around our benchmarking efforts * Identification of potential AI benchmark candidates. |
Collaborator Contribution | * Taking part in our community meetings and workshops * Co-Organising training events for the AI for Science community * Consultancy on benchmark candidates * Inviting us to use their systems. |
Impact | Initial prototype benchmarks. Multidisciplinary: astrophysics and machine learning. |
Start Year | 2020 |
Description | Collaboration with DiRAC (Federation Project) |
Organisation | Science and Technology Facilities Council (STFC) |
Department | Distributed Research Utilising Advanced Computing |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Following various training programmes offered by the Scientific Machine Learning Research Group, DiRAC has invited SciML to collaborate with DiRAC in realising their Federation project where SciML will provide a number of AI/ML-specific training materials to DiRAC, which they will use in their training programmes. |
Collaborator Contribution | Domain expertise and access to systems. |
Impact | * 15 different science cases, as Jupyter notebooks, demonstrating AI for Science. |
Start Year | 2021 |
Description | Collaboration with DiRAC for AI Benchmarking |
Organisation | Science and Technology Facilities Council (STFC) |
Department | Distributed Research Utilising Advanced Computing |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The purpose of this collaboration is to advance AI benchmarking. The collaboration provided an avenue for developing and evaluating benchmarks on DiRAC systems, while receiving benchmark cases from DiRAC. As such, the contributions from the Scientific Machine Learning Research Group include * Expertise around AI benchmarking * Framework development * Development and evaluation of cases |
Collaborator Contribution | * Access to DiRAC systems (e.g. CSD3) * Access to various scientific cases * Additional collaborations |
Impact | * Additional benchmark cases * Improvements to the benchmark framework. |
Start Year | 2021 |
Description | Collaboration with European Spallation Source |
Organisation | European Spallation Source |
Country | Sweden |
Sector | Public |
PI Contribution | The BASE programme aims to develop an AI benchmarking suite for exascale systems, covering multiple domains of science. In this, AI for Neutron and Photon Science is a core area of focus, and for this collaboration we brought a prototype benchmark suite and potential benchmark cases (such as EM-Noise) that would be of interest to ESS. |
Collaborator Contribution | The ESS has contributed to this collaboration by engaging in a number of discussions around benchmarking, exploiting the cloud for data and science, and identifying potential benchmark candidates. |
Impact | Multi-disciplinary: AI/ML, Photon Science, Neutron Science. |
Start Year | 2020 |
Description | Collaboration with GraphCore (AI Benchmarking) |
Organisation | Graphcore |
Country | United Kingdom |
Sector | Private |
PI Contribution | The purpose of this collaboration is to advance AI benchmarking. The collaboration provided an avenue for testing and evaluating our benchmark suite on modern Graphcore architectural platforms. As such, the contributions from the Scientific Machine Learning Research Group include * Expertise around AI benchmarking * Framework development * Development and evaluation of cases. |
Collaborator Contribution | Access to Graphcore systems, expert support and advice around developmental efforts. |
Impact | New stream of benchmarks to be released in 2022. |
Start Year | 2021 |
Description | Collaboration with IBM Research UK |
Organisation | IBM |
Department | IBM Research in the UK |
Country | United Kingdom |
Sector | Private |
PI Contribution | One of the key agendas of the Benchmarking for AI for Science at Exascale (BASE) programme is developing science-specific AI benchmarks for future systems. As such, we have developed a benchmark suite that would benefit both the open science community and industry. In this instance, the benchmarks are useful to IBM for evaluating and comparing various AI systems and algorithms. |
Collaborator Contribution | Over the course of the collaboration, IBM Research has contributed to our partnership through * Regular meetings with various IBM groups to discuss AI for Science aspects * Working with Boston Systems to provide access to some IBM systems for benchmark evaluation * Advising us on system- or vendor-specific tunings that can be applied to benchmarks * Discussing benchmark results * Taking part in our workshops * Providing technical steer wherever possible, and * Involving us in additional collaborations. |
Impact | Some initial results on our prototype benchmark suite. This collaboration is multi-disciplinary, covering systems, engineering, computer science, and AI/machine learning. |
Start Year | 2020 |
Description | Collaboration with Intel on End to End Benchmarking |
Organisation | Intel Corporation |
Country | United States |
Sector | Private |
PI Contribution | This partnership is on building two end-to-end AI benchmarks and performance evaluation of those on different architectural platforms. The SciML team is: * Building two end-to-end scientific benchmarks (covering two different scientific domains, namely environmental sciences and life sciences) * Performance evaluation of those benchmarks on at least two different computing platforms, and * Consolidating two large, scientific datasets that would facilitate AI benchmarking. |
Collaborator Contribution | Other than the financial contribution to carry out the research, the following contributions are being made: * Access to staff time from Intel for shaping the benchmarks, performance evaluation, and interpretation of the results. |
Impact | It is a multi-disciplinary collaboration - covering computing, machine learning, life sciences, and environmental sciences. The resulting outcomes are: * Two end-to-end benchmarks, which will be opened to the wider community * Scientific insights into the performance of benchmarks * Wider influence on the overall benchmarking programme within the SciML group. |
Start Year | 2020 |
Description | Collaboration with MathWorks Ltd |
Organisation | The Mathworks Ltd |
Country | United Kingdom |
Sector | Private |
PI Contribution | One of the key agendas of the BASE programme is developing AI benchmarks that solve scientific problems. These benchmarks are novel (covering implementations and datasets). For this partnership, we provided some initial benchmarks and potential ways of mapping them onto MATLAB. |
Collaborator Contribution | * Taking part in our community events, such as workshops organised by this working group * Consulting with our development team on the typical needs and requirements they seek from benchmarks * Providing a dedicated person to discuss potential benchmarks and the challenges of distributing them to the MATLAB community, and * Providing technical steer wherever possible |
Impact | Some initial benchmarks. |
Start Year | 2020 |
Description | Collaboration with NVIDIA |
Organisation | NVIDIA |
Country | Global |
Sector | Private |
PI Contribution | The BASE programme was aimed at developing a number of science-specific AI benchmarks. As such, we have developed a benchmark suite that would benefit both the open science community and industries. In this instance, the benchmarks are useful for NVIDIA for assessing the performance of various generations of GPUs, and other software tools they provide. |
Collaborator Contribution | * Regular meetings with us to discuss the progress of our work * Advise and consultancy on various challenges around benchmarking AI solutions on GPUs * Consulting our development team by discussing typical needs and requirements they will be seeking from benchmarks * Advising us how to fine-tune benchmarks * Discussing benchmark results * Taking part in our workshops, and * Providing technical steer wherever possible. |
Impact | Initial prototype benchmark suite, and some results across various systems. |
Start Year | 2020 |
Description | Collaboration with Shanghai Synchrotron Radiation Facility |
Organisation | Shanghai Synchrotron Radiation Facility (SSRF) |
Country | China |
Sector | Academic/University |
PI Contribution | The BASE programme aims to develop an AI benchmarking suite for exascale systems, covering multiple domains of science. In this case, we considered material sciences as one of the areas of focus. To this collaboration, we brought a prototype benchmark suite and potential benchmark cases from material sciences, particularly those relevant to experimental facilities like SSRF. |
Collaborator Contribution | They have contributed by providing expert advice to our programme, and took part in the discussions and workshops we organised. |
Impact | Prototype Benchmark Suite. Multidisciplinary: Material Sciences |
Start Year | 2020 |
Description | Collaboration with University of California, Berkeley |
Organisation | University of California, Berkeley |
Country | United States |
Sector | Academic/University |
PI Contribution | The BASE programme aims to develop an AI benchmarking suite for exascale systems, covering multiple domains of science. In this, AI for Science is a core area of focus, and for this collaboration we brought a prototype benchmark suite and potential benchmark cases similar to those of the CAMERA project carried out at Berkeley. |
Collaborator Contribution | They have contributed by providing expert advice to our programme, and took part in the discussions and workshops we organised. |
Impact | Prototype benchmark suite and additional collaborations. This is a multi-disciplinary collaboration touching AI for Science, mathematics and machine learning. |
Start Year | 2020 |
Description | Collaboration with University of Washington |
Organisation | University of Washington |
Country | United States |
Sector | Academic/University |
PI Contribution | The BASE programme aims to develop an AI benchmarking suite for exascale systems, covering multiple domains of science. In this case, we considered astronomy as one of the areas of focus. To this collaboration, we brought a prototype benchmark suite and potential benchmark cases from astronomy. |
Collaborator Contribution | They have contributed by providing expert advice on our programme, and took part in the discussions and workshops we organised. |
Impact | Multidisciplinary: AI/ML and Astronomy. |
Start Year | 2020 |
Description | Oak Ridge National Laboratory (ORNL) |
Organisation | Oak Ridge National Laboratory |
Country | United States |
Sector | Public |
PI Contribution | The overall collaboration is for advancing the AI benchmarking agenda. As such, for this collaboration, the Scientific Machine Learning Research Group contributed by * Bringing knowledge and expertise around AI benchmarking * Working with ORNL to identify a scientific case as a benchmark, and * Identifying a dataset to be used for the benchmark. |
Collaborator Contribution | * Domain expertise * Scientific code for the benchmark * Simulated dataset |
Impact | * The SEMDL benchmark created as part of this collaboration will be embedded in the next SciML-Bench release (1.5.0) in 2022. |
Start Year | 2021 |
Title | SciML-Bench (Revision) |
Description | This is an extended and improved version of the original SciML-Bench software created under the Benchmarking for AI for Science agenda (covering exascale aspects). The software includes a) a suite of scientific applications, b) relevant datasets (hosted by STFC), c) a framework for running these benchmarks, and d) containerised recipes. |
Type Of Technology | Webtool/Application |
Year Produced | 2021 |
Open Source License? | Yes |
Impact | This release has attracted a considerable number of collaborations with public sector research establishments (PSREs), industry, various national laboratories, and academic institutions. |
URL | https://github.com/stfc-sciml/sciml-bench/ |
Description | 'DES dark matter map' on BBC1 TV News at Six and Ten, BBC online, Guardian first page, Nature report (May 2021) |
Form Of Engagement Activity | A press release, press conference or response to a media enquiry/interview |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | The mass map from the Dark Energy Survey resulted from the PhD work of Niall Jeffrey at UCL (supervised by Ofer Lahav). It appeared in May 2021 on BBC1 TV News at Six and Ten, BBC online, Guardian first page, and in a Nature report. |
Year(s) Of Engagement Activity | 2021 |
Description | Benchmarking for AI for Science |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Industry/Business |
Results and Impact | This was a poster and a talk given at the UKRI-BBSRC event titled AI in Bio Sciences. The event was organised by UKRI, covering academia, national laboratories and industry, and aimed at presenting themed topics around AI for Science, targeted at the biosciences community. The event generated substantial interest in the use of various ML/AI techniques and their suitability for various bioscience problems, a question common across all other domains. The talk was given by Tony Hey on 19 May 2021. |
Year(s) Of Engagement Activity | 2021 |
Description | Engagement with the IRIS Community |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | This was a presentation made to the IRIS Machine Learning Workshop, attended by 25+ people: leading scientists and academics who use, or aim to use, machine learning or AI as part of their scientific work. The talk resulted in a series of discussions and further collaborations. |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.c2d3.cam.ac.uk/events/iris-machine-learning-workshop |
Description | Engagement with the MLCommons Science Working Group |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The AI benchmarking initiatives were included as part of the MLCommons Science Working Group (part of the MLCommons initiatives), which has a broader, international-level outreach. The engagement included various presentations at meetings, submission and curation of new datasets and benchmarks from the benchmark suite created as part of the AI benchmarking initiatives (covering scaling, exascale and AI for Science aspects), and, more importantly, collaboration with various other national and international laboratories. |
Year(s) Of Engagement Activity | 2021,2022 |
URL | https://mlcommons.org/en/groups/research-science/ |
Description | Talk at the CCDSC Workshop 2022 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | There were two invited talks at the Workshop on Clusters, Clouds, and Data for Scientific Computing (CCDSC) 2022: Tony Hey, "The SKA Project and FAIR Data", and Jeyan Thiyagalingam, "Four Years of AI for Science at the Rutherford Appleton Laboratory". The first talk covered the notion of FAIR data in the SKA project with an AI for Science focus, while the latter provided a brief outline of the AI for Science initiatives at the Rutherford Appleton Laboratory. |
Year(s) Of Engagement Activity | 2022 |
URL | https://tiny.utk.edu/ccdsc-2022 |
Description | Talk at the CERFACS / N7, France |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Industry/Business |
Results and Impact | In this talk, the overall experience of the AI for Science initiatives at the Rutherford Appleton Laboratory was presented to professors, industrial practitioners, and researchers at CERFACS, France, an industry-focused research organisation formed by the university (N7). There were around 70 participants, covering university academics, industry researchers and postdoctoral researchers. |
Year(s) Of Engagement Activity | 2022 |
Description | Talk at the Eu-XFEL, titled "AI for Science at the Rutherford Appleton Laboratory : Past, Present and Future" |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This was an invited talk at the Eu-XFEL (European X-Ray Free-Electron Laser Facility), staged as a hybrid event and attracting well over 120 participants from various laboratories working in this area. The talk generated a large number of requests for collaboration and for additional visits. |
Year(s) Of Engagement Activity | 2023 |
Description | Talk at the National Institute for Standards and Technology (US) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | This was an invited talk on Benchmarking for Science delivered to the National Institute of Standards and Technology (NIST, https://www.nist.gov/), attended by several divisions within NIST. The talk has resulted in further collaborations and ongoing activities (such as converting their Genome in a Bottle dataset into a benchmark). |
Year(s) Of Engagement Activity | 2020 |
Description | Talk at the Natural History Museum, London on "Accelerating the pace of AI at NHM through lessons and expertise from AI for Science at the Rutherford Appleton Laboratory" |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | This was an invited talk for the Natural History Museum (NHM), London, on "Accelerating the pace of AI at NHM through lessons and expertise from AI for Science at the Rutherford Appleton Laboratory", with the aim of fostering collaborations between the Rutherford Appleton Laboratory and the NHM. There were around 30 participants, based at the Turing Institute and the NHM, and the talk has led to further discussions on collaboration. |
Year(s) Of Engagement Activity | 2022 |