PSDI Phase 1b
Lead Research Organisation:
STFC - LABORATORIES
Department Name: Scientific Computing Department
Abstract
PSDI is a key enabler of Digital Chemistry and Materials Discovery, providing a platform to underpin the role of digital technologies and AI in enabling discovery across the Physical Sciences, and linking to data infrastructures in other domains. Through PSDI, researchers will be able to leverage the combination of the transformative potential of digital technologies with molecular and materials science principles to support their everyday working practice whilst at the same time accelerating discovery and innovation. PSDI will help drive the missions to achieve a Net Zero Chemicals Sector by 2041; reimagine materials discovery to accelerate technologies for Net Zero; and optimise drug discovery beyond small molecules.
Today, each physical science research infrastructure, from individual laboratories to large facilities, has essentially its own isolated data ecosystem, which is often bespoke and managed to varying degrees. In contrast, many other domains have data-centric infrastructures for collecting and reusing data which act as community hubs and drivers of new methods and discoveries. There is a clear need within physical sciences for an additional infrastructure layer to enable researchers to acquire, analyse and share their data in addition to searching, using and aggregating a wide range of existing resources whilst ensuring that each dataset can remain dedicated to its specific application.
There is a need to preserve and exploit outputs from past research while keeping pace with the increasing rate of data generation, the latter posing the greatest challenge and potential for innovation. New chemicals, materials and devices are key to a sustainable future, both environmentally and financially. The UK needs to invent its way out of seemingly conflicting targets of maintaining economic growth whilst making unprecedented strides towards an imminent net zero carbon output and PSDI will be the enabling vehicle for this approach.
This phase of PSDI builds on the results of the PSDI pilot project which ran from November 2021 to March 2022 (www.psdi.ac.uk). In this second phase of PSDI, we will begin to implement the recommendations that were developed during the PSDI pilot phase. We will commence development of the PSDI "Hub" and a number of "Pathfinders" which will seed the population of the Hub. This phase will also continue community engagement activities and initiate a community governance mechanism for PSDI, as well as exploring and evaluating possible future Pathfinders.
Publications

Bicarregui J
(2023)
Connecting Infrastructures: The Physical Sciences Data Infrastructure (PSDI) in the UK
in Proceedings of the Conference on Research Data Infrastructure


Title | Data Revival - Making the intangible tangible: The journey from lab notebook to digital insight - webinar |
Description | In the Physical Sciences Data Infrastructure (PSDI) our first round pathfinders are exploratory pieces of work looking at an application area where PSDI could develop tools to enhance the research infrastructure. This video presents a recording of the Data Revival webinar presented by Samuel Munday on 16th November 2023. https://www.psdi.ac.uk/event/webinar-data-revival/ We discuss the process of digitising an archive of laboratory notebooks effectively, the AI tools we have created to work with such unstructured knowledge at scale, the utility of the digital database created for the chemistry department, and the feedback received from the department on the system's potential for further development. |
Type Of Art | Film/Video/Animation |
Year Produced | 2024 |
Impact | Further enquiries about the Pathfinder activities in the PSDI initiative. |
URL | https://youtu.be/yq-lhlYbJ4U |
Title | Introduction to PSDI: Webinar |
Description | The Physical Sciences Data Infrastructure (PSDI) is an initiative funded by EPSRC which aims to accelerate research in the physical sciences by providing a data infrastructure that brings together and builds upon the various data systems researchers currently use. This video presents a recording of the Introduction to PSDI webinar which was run on 29th June 2023. https://www.psdi.ac.uk/event/webinar-introduction-to-psdi/ |
Type Of Art | Film/Video/Animation |
Year Produced | 2023 |
Impact | Further enquiries about the PSDI initiative. |
URL | https://youtu.be/iOg8YSE-A7I |
Title | PSDI Pathfinders: Data Capture in Catalysis - webinar |
Description | In the Physical Sciences Data Infrastructure (PSDI) our first round pathfinders are exploratory pieces of work looking at an application area where PSDI could develop tools to enhance the research infrastructure. This video presents a recording of the Pathfinder 1: Experimental Data Capture in Catalysis webinar presented by Abraham Nieva de la Hidalga which was run on 3rd October 2023. https://www.psdi.ac.uk/event/webinar-psdi-pf1/ We demonstrate two techniques for processing and analysing data that generate the required metadata to create FAIR digital objects. |
Type Of Art | Film/Video/Animation |
Year Produced | 2023 |
Impact | Further enquiries about the Pathfinder 1 activities in the PSDI initiative, in particular about expanding the work to different analytical techniques. |
URL | https://youtu.be/hKMhO1_xUtE |
Title | PSDI Pathfinders: FAIR Data for the Biomolecular Simulation Community - webinar |
Description | In the Physical Sciences Data Infrastructure (PSDI) our first round pathfinders are exploratory pieces of work looking at an application area where PSDI could develop tools to enhance the research infrastructure. This video presents a recording of the Pathfinder 4: FAIR Data for the Biomolecular Simulation Community webinar presented by James Gebbie-Rayet and Jas Kalayan which was run on 18th October 2023. https://www.psdi.ac.uk/event/webinar-psdi-pf4/ We present possible solutions to address these two problems: firstly, a software tool to record data provenance towards FAIR compliant formats, and secondly, an online data repository to store and share this data. |
Type Of Art | Film/Video/Animation |
Year Produced | 2024 |
Impact | Further enquiries about the Pathfinder 4 activities in the PSDI initiative, reaching out to the wider biomolecular simulation community at different institutions. |
URL | https://youtu.be/FA_rVv-hZig |
Title | PSDI Pathfinders: Process Recording - Webinar |
Description | In the Physical Sciences Data Infrastructure (PSDI) our first round pathfinders are exploratory pieces of work looking at an application area where PSDI could develop tools to enhance the research infrastructure. This video presents a recording of the Pathfinder 2: Process Recording webinar presented by Dr. Samantha Kanza which was run on 27th July 2023. https://www.psdi.ac.uk/event/webinar-psdi-pf2/ The webinar discusses the shift in software offerings and attitudes to process recording software, and reports on the results of a recent survey on ELN and notebook usage in our physical sciences community. |
Type Of Art | Film/Video/Animation |
Year Produced | 2023 |
Impact | Further enquiries about the Pathfinder 2 activities in the PSDI initiative |
URL | https://youtu.be/r2Hre41xJSk |
Description | Physical Sciences Data Infrastructure Phase 1b - Southampton Extension |
Amount | £2,146,968 (GBP) |
Funding ID | EP/X032701/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 09/2023 |
End | 12/2024 |
Description | AI4Green (University of Nottingham) |
Organisation | University of Nottingham |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We are helping the AI4Green team with a study of the implementation of the AI4Green ELN in the undergraduate and postgraduate labs at the University of Nottingham. |
Collaborator Contribution | The AI4Green team are letting us use them as a case study for our research in exchange for help with their ELN implementation. They are also providing us with intel on their OneNote lab books. |
Impact | Magazine article in Lab Horizons detailing the initial visit; led to a new collaboration with Splashlake to explore the use of OneNote as an ELN. Further outputs will include publications on the outcomes of this case study. |
Start Year | 2023 |
Description | Catalysis Hub |
Organisation | Research Complex at Harwell |
Department | UK Catalysis Hub |
Country | United Kingdom |
Sector | Public |
PI Contribution | PSDI has provided expertise with metadata and software development, in particular with the Galaxy workflow platform. |
Collaborator Contribution | CatalysisHub has provided technique specific knowledge, domain specific knowledge and access to the community. |
Impact | Contribution to the PSDI webinar series. Engagement event: "Experimental Data Capture: producing publish-ready data from processing and analysis processes", with an example of XAS data processing. |
Start Year | 2022 |
Title | Aiida-gromacs plugin |
Description | The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. The design pattern we are aiming for is to allow researchers to capture the full data provenance for their simulations simply by switching on an AiiDA conda environment and modifying their command lines very slightly. |
Type Of Technology | Software |
Year Produced | 2024 |
Open Source License? | Yes |
Impact | The implementation of this plugin means researchers gain access to powerful FAIR data practices without wholesale cultural or usage pattern shifts in their daily work. |
URL | https://aiida-gromacs.readthedocs.io/en/latest/ |
Description | AI4SD, PSDS & PSDI Skills4Scientists Series |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Undergraduate students |
Results and Impact | This series was organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI4SD), the Physical Sciences Data-Science Service (PSDS), and the Physical Sciences Data Infrastructure (PSDI). The series was initially run over summer 2021 and aimed to educate scientists and improve their skills in a range of areas including research data management, Python, version control, ethics, and career development. The first iteration was primarily aimed at final year undergraduates and early stage PhD students. The series was run again in 2022 and 2023 and is in further development for 2024 to create a flipped/blended learning course and to make a wide range of materials available online alongside the initial video content. |
Year(s) Of Engagement Activity | 2021,2022,2023 |
URL | https://eprints.soton.ac.uk/453198/ |
Description | Machine Learning for Atomistic Modelling Autumn School 2023 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Postgraduate students |
Results and Impact | This was a three-day event that took place in person at the Daresbury Laboratory. It was a machine learning for materials training course that was run by the Physical Sciences Data Infrastructure (PSDI) initiative in collaboration with PSDS, AI4SD, STFC-SCD and CCP5. This training was targeted towards PhD students, in particular those in the Materials and Molecular Simulations field. The aim of this training was to introduce attendees to the latest methods of machine learning applied to atomistic simulation of materials. This training encompassed a number of talks and practical sessions, focusing on the basics of machine learning, machine learning interatomic potentials and graph neural networks. There was also an opportunity for attendees to present a poster on their work. Overall the school was very well received, with requests to run it as a yearly event. |
Year(s) Of Engagement Activity | 2023 |
URL | https://www.psdi.ac.uk/event/machine-learning-autumn-school-2023/ |
Description | Skills4Scientists training series - 2023 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Undergraduate students |
Results and Impact | Skills4Scientists training programme delivered to the 2023 internship cohort run by PSDI at the University of Southampton. This included technical and research skills sessions. This boosted the confidence of the students and informed them about further research work. One student in particular went on to start a PhD as a result of this internship work. |
Year(s) Of Engagement Activity | 2023 |
Description | Webinar Series |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A webinar series was set up to communicate the PSDI work to the community and to gather its input. So far 6 webinars have been run with over 180 attendees. These webinars have covered the breadth of our pathfinder activities. They have also been recorded and have over 200 additional views on YouTube. |
Year(s) Of Engagement Activity | 2023,2024 |
URL | http://www.psdi.ac.uk/events |