PSDI Phase 1b

Lead Research Organisation: Science and Technology Facilities Council
Department Name: Scientific Computing Department

Abstract

PSDI is a key enabler of Digital Chemistry and Materials Discovery, providing a platform to underpin the role of digital technologies and AI in enabling discovery across the Physical Sciences, and linking to data infrastructures in other domains. Through PSDI, researchers will be able to leverage the combination of the transformative potential of digital technologies with molecular and materials science principles to support their everyday working practice whilst at the same time accelerating discovery and innovation. PSDI will help drive the missions to achieve a Net Zero Chemicals Sector by 2041; reimagine materials discovery to accelerate technologies for Net Zero; and optimise drug discovery beyond small molecules.

Today, each physical science research infrastructure, from individual laboratories to large facilities, has essentially its own isolated data ecosystem, which is often bespoke and managed to varying degrees. In contrast, many other domains have data-centric infrastructures for collecting and reusing data which act as community hubs and drivers of new methods and discoveries. There is a clear need within the physical sciences for an additional infrastructure layer that enables researchers to acquire, analyse and share their data, and to search, use and aggregate a wide range of existing resources, whilst ensuring that each dataset can remain dedicated to its specific application.

There is a need to preserve and exploit outputs from past research while keeping pace with the increasing rate of data generation, the latter posing the greatest challenge and potential for innovation. New chemicals, materials and devices are key to a sustainable future, both environmentally and financially. The UK needs to invent its way out of seemingly conflicting targets of maintaining economic growth whilst making unprecedented strides towards an imminent net zero carbon output and PSDI will be the enabling vehicle for this approach.

This phase of PSDI builds on the results of the PSDI pilot project which ran from November 2021 to March 2022 (www.psdi.ac.uk). In this second phase of PSDI, we will begin to implement the recommendations that were developed during the PSDI pilot phase. We will commence development of the PSDI "Hub" and a number of "Pathfinders" which will seed the population of the Hub. This Phase will also continue community engagement activities and initialise a community governance mechanism for PSDI, as well as exploring and evaluating possible future pathfinders.
 
Title Data Revival - Making the intangible tangible: The journey from lab notebook to digital insight - webinar 
Description In the Physical Sciences Data Infrastructure (PSDI) our first round pathfinders are exploratory pieces of work looking at an application area where PSDI could develop tools to enhance the research infrastructure. This video presents a recording of the Data Revival webinar presented by Samuel Munday on 16th November 2023. https://www.psdi.ac.uk/event/webinar-data-revival/ We discuss the process of digitising an archive of laboratory notebooks effectively, the AI tools we have created to work with such unstructured knowledge at scale, the utility of the digital database created for the chemistry department, and the feedback received from the department on the system's potential for further development. 
Type Of Art Film/Video/Animation 
Year Produced 2024 
Impact Further enquiries about the Pathfinder activities in the PSDI initiative. 
URL https://youtu.be/yq-lhlYbJ4U
 
Title Introduction to PSDI: Webinar 
Description The Physical Sciences Data Infrastructure (PSDI) is an initiative funded by EPSRC which aims to accelerate research in the physical sciences by providing a data infrastructure that brings together and builds upon the various data systems researchers currently use. This video presents a recording of the Introduction to PSDI webinar which was run on 29th June 2023. https://www.psdi.ac.uk/event/webinar-introduction-to-psdi/ 
Type Of Art Film/Video/Animation 
Year Produced 2023 
Impact Further enquiries in the PSDI initiative 
URL https://youtu.be/iOg8YSE-A7I
 
Title PSDI Pathfinders: Data Capture in Catalysis - webinar 
Description In the Physical Sciences Data Infrastructure (PSDI) our first round pathfinders are exploratory pieces of work looking at an application area where PSDI could develop tools to enhance the research infrastructure. This video presents a recording of the Pathfinder 1: Experimental Data Capture in Catalysis webinar presented by Abraham Nieva de la Hidalga which was run on 3rd October 2023. https://www.psdi.ac.uk/event/webinar-psdi-pf1/ We demonstrate two techniques for processing and analysing data that generate the required metadata to create FAIR digital objects. 
Type Of Art Film/Video/Animation 
Year Produced 2023 
Impact Further enquiries about the Pathfinder 1 activities in the PSDI initiative. In particular expansion of the work to different analytical techniques. 
URL https://youtu.be/hKMhO1_xUtE
 
Title PSDI Pathfinders: FAIR Data for the Biomolecular Simulation Community - webinar 
Description In the Physical Sciences Data Infrastructure (PSDI) our first round pathfinders are exploratory pieces of work looking at an application area where PSDI could develop tools to enhance the research infrastructure. This video presents a recording of the Pathfinder 4: FAIR Data for the Biomolecular Simulation Community webinar presented by James Gebbie-Rayet and Jas Kalayan which was run on 18th October 2023. https://www.psdi.ac.uk/event/webinar-psdi-pf4/ We present possible solutions to address these two problems: first, a software tool to record data provenance towards FAIR-compliant formats, and second, an online data repository to store and share this data. 
Type Of Art Film/Video/Animation 
Year Produced 2024 
Impact Further enquiries about the Pathfinder 4 activities in the PSDI initiative, reaching out to the wider biomolecular simulation community at different institutions. 
URL https://youtu.be/FA_rVv-hZig
 
Title PSDI Pathfinders: Process Recording - Webinar 
Description In the Physical Sciences Data Infrastructure (PSDI) our first round pathfinders are exploratory pieces of work looking at an application area where PSDI could develop tools to enhance the research infrastructure. This video presents a recording of the Pathfinder 2: Process Recording webinar presented by Dr. Samantha Kanza which was run on 27th July 2023. https://www.psdi.ac.uk/event/webinar-psdi-pf2/ The webinar discusses the shift in software offerings and attitudes to process recording software, and reports on the results of a recent survey on ELN and notebook usage in our physical sciences community. 
Type Of Art Film/Video/Animation 
Year Produced 2023 
Impact Further enquiries about the Pathfinder 2 activities in the PSDI initiative 
URL https://youtu.be/r2Hre41xJSk
 
Description Physical Sciences SAT - Jeremy
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
 
Description Physical Sciences Data Infrastructure Phase 1b - Southampton Extension
Amount £2,146,968 (GBP)
Funding ID EP/X032701/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 09/2023 
End 12/2024
 
Title Coarse-grained MD simulation provenance of membrane embedded GPCR using GROMACS and aiida-gromacs 
Description Example setup of a martini2 coarse-grained molecular dynamics simulation of the active state PTH2R (Parathyroid hormone receptor type 2) protein embedded in a lipid bilayer membrane along with water and counter-ions. Command-line tools provided in aiida-gromacs are used to track each step performed on the terminal. The data: The files in this Zenodo record are: 1144.dot.pdf: The image of the graph representation for the simulation workflow. gpcr_all_steps.aiida: The aiida archive file where all simulation setup steps performed in this work are packaged. run_commands.sh: The script used to produce the provenance data for this system. inputs_only.zip: Simulation setup input files for the 'run_commands.sh' script. The output files produced by the aiida-gromacs provenance tool used to collect the simulation steps are in the following files: 1_protein.zip: Output files for the retrieval and cleaning up of the protein structure. 2_martinize.zip: Output files for the protein coarse graining steps. 3_insane.zip: Output files for building the lipid bilayer membrane around the coarse-grained protein. 4_gromacs.zip: Output files for the ionisation steps in gromacs. 5_gromacs.zip: Output files for minimisation, equilibration and production simulation of the simulated system. The stripped trajectories for the active and inactive states of PTH2R are in the following files: R1a.pdb and R1a.xtc: pdb and trajectory of coordinates for the active state protein. R1i.pdb and R1i.xtc: pdb and trajectory of coordinates for the inactive state protein. 
Type Of Material Database/Collection of data 
Year Produced 2024 
Provided To Others? Yes  
URL https://zenodo.org/doi/10.5281/zenodo.14359055
 
Title Coarse-grained MD simulation provenance of membrane embedded GPCR using GROMACS and aiida-gromacs 
Description Example setup of a martini2 coarse-grained molecular dynamics simulation of the active state PTH2R (Parathyroid hormone receptor type 2) protein embedded in a lipid bilayer membrane along with water and counter-ions. Command-line tools provided in aiida-gromacs are used to track each step performed on the terminal. The data: The files in this Zenodo record are: 1144.dot.pdf: The image of the graph representation for the simulation workflow. gpcr_all_steps.aiida: The aiida archive file where all simulation setup steps performed in this work are packaged. run_commands.sh: The script used to produce the provenance data for this system. inputs_only.zip: Simulation setup input files for the 'run_commands.sh' script. The output files produced by the aiida-gromacs provenance tool used to collect the simulation steps are in the following files: 1_protein.zip: Output files for the retrieval and cleaning up of the protein structure. 2_martinize.zip: Output files for the protein coarse graining steps. 3_insane.zip: Output files for building the lipid bilayer membrane around the coarse-grained protein. 4_gromacs.zip: Output files for the ionisation steps in gromacs. 5_gromacs.zip: Output files for minimisation, equilibration and production simulation of the simulated system. The stripped trajectories for the active and inactive states of PTH2R are in the following files: R1a.pdb and R1a.xtc: pdb and trajectory of coordinates for the active state protein. R1i.pdb and R1i.xtc: pdb and trajectory of coordinates for the inactive state protein. 
Type Of Material Database/Collection of data 
Year Produced 2024 
Provided To Others? Yes  
URL https://zenodo.org/doi/10.5281/zenodo.14359056
 
Title Data accessibility in the chemical sciences: an analysis of recent practice in organic chemistry journals 
Description This data package contains the analysis of the data outputs of 240 randomly selected research papers from 12 top-ranked journals published in early 2023. We investigate author compliance with recommended (but not compulsory) data policies, whether there is evidence to suggest that authors apply FAIR data guidance in their data publishing, and whether the existence of specific recommendations for publishing NMR data by some journals encourages compliance. Files in the data package have been provided in both human and machine-readable forms. The main dataset is available in the Excel file Data worksheet.XLSX, the contents of which can also be found in Main_dataset.CSV, Data_types.CSV, and Article_selection.CSV with explanations of the variable coding used in the studies in Variable_names.CSV, Codes.CSV, and FAIR_variable_coding.CSV. The R code used for the article selection can be found in Article_selection.R. Data about article types from the journals that contain original research data is in Article_types.CSV. Data collected for analysis in our sister paper[4] can be found in Extended_Adherence.CSV, Extended_Crystallography.CSV, Extended_DAS.CSV, Extended_File_Types.CSV, and Extended_Submission_Process.CSV. A full list of files in the data package and a short description for each is given in README.TXT. 
Type Of Material Database/Collection of data 
Year Produced 2024 
Provided To Others? Yes  
Impact Not yet aware of impacts 
URL https://zenodo.org/doi/10.5281/zenodo.13928084
 
Title SlimMD 
Description This database of molecular dynamics trajectories has been designed to provide a light-weight set of trajectories for use in training and teaching materials. All MD trajectories in this database will have a simplified description, have a maximum file size of 1GB and will have been tested and be known to work with a vanilla install of VMD. There is a wide range of biological systems with interesting features observed in real simulations conducted by various community members. Currently this dataset resides on the CCPBioSim website, but it has been included in PSDI's BioSimDB service, which is due to launch in March 2025. 
Type Of Material Database/Collection of data 
Year Produced 2023 
Provided To Others? Yes  
Impact This database has had many uses in educational settings, based on feedback given by the universities that are making use of it. Collective downloads of all systems in the database total approximately 15,000 unique, non-bot downloads. 
URL https://ccpbiosim.ac.uk/slim-md
 
Description AI4Green (University of Nottingham) 
Organisation University of Nottingham
Country United Kingdom 
Sector Academic/University 
PI Contribution We are helping the AI4Green team with a study of the implementation of the AI4Green ELN in the undergraduate and postgraduate labs at the University of Nottingham.
Collaborator Contribution The AI4Green team are letting us use them as a case study for our research in exchange for help with their ELN implementation. They are also providing us with intel on their OneNote lab books.
Impact - Magazine article in Lab Horizons detailing the initial visit
- Led to new collaboration with Splashlake to explore the use of OneNote as an ELN
Further outputs will include publications on the outcomes of this case study.
Start Year 2023
 
Description Catalysis Hub 
Organisation Research Complex at Harwell
Department UK Catalysis Hub
Country United Kingdom 
Sector Public 
PI Contribution PSDI has provided expertise with metadata and software development, in particular with the Galaxy workflow platform.
Collaborator Contribution CatalysisHub has provided technique specific knowledge, domain specific knowledge and access to the community.
Impact Contribution to the PSDI webinar series. Engagement event: Experimental Data Capture, producing publish-ready data from processing and analysis processes, with an example of XAS data processing.
Start Year 2022
 
Description DCC 
Organisation University of Edinburgh
Department Digital Curation Centre (DCC)
Country United Kingdom 
Sector Academic/University 
PI Contribution Organisation of community events, expert domain knowledge within multiple areas of the physical sciences.
Collaborator Contribution Expertise in digital information curation with a focus on building capacity, capability, and skills for research data management and data sharing. They helped elicit input from the community, synthesizing community responses to generate recommendations as to how PSDI could address these challenges and concerns.
Impact Community data workshops, including reports.
Start Year 2023
 
Title Aiida-gromacs plugin 
Description The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. The design pattern we are aiming for is simple: researchers capture the full data provenance for their simulations just by activating an AiiDA conda environment and modifying their command lines very slightly. 
Type Of Technology Software 
Year Produced 2024 
Open Source License? Yes  
Impact The implementation of this plugin means researchers gain access to powerful FAIR data practices without wholesale cultural or usage pattern shifts in their daily work. 
URL https://aiida-gromacs.readthedocs.io/en/latest/
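
The idea of capturing provenance by lightly wrapping each command-line step can be illustrated with a minimal sketch. This is not the aiida-gromacs API: the function names, the JSON-lines log format and the file-copy stand-in for a simulation step are all assumptions made for illustration. It only shows the general pattern of recording a command together with hashes of its input and output files.

```python
import hashlib
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def run_with_provenance(command, inputs, outputs, log_path):
    """Run a command and append one provenance record to a JSON-lines log.

    Hypothetical helper: records the command line plus content hashes of
    the declared input and output files.
    """
    # Hash inputs before running, in case the command modifies them.
    input_hashes = {str(p): sha256_of(p) for p in inputs}
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    record = {
        "command": list(command),
        "inputs": input_hashes,
        "outputs": {str(p): sha256_of(p) for p in outputs},
        "returncode": result.returncode,
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record


def demo():
    """Track a single stand-in step (a file copy) and return its record."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "input.txt"
        dst = Path(tmp) / "output.txt"
        log = Path(tmp) / "provenance.jsonl"
        src.write_text("protein structure placeholder\n")
        # Stand-in for a simulation setup step: copy the input file.
        cmd = [
            sys.executable, "-c",
            "import shutil, sys; shutil.copy(sys.argv[1], sys.argv[2])",
            str(src), str(dst),
        ]
        record = run_with_provenance(cmd, [src], [dst], log)
        return record, len(log.read_text().splitlines())
```

Calling `demo()` runs one tracked step and returns the provenance record along with the number of entries written to the log. A real provenance system such as AiiDA stores a graph of linked steps rather than a flat log.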
 
Title PSDI AiiDA GPCR Workshop Container 
Description This container is derived from the CCPBioSim JupyterHub image. This container adds the necessary software packages and notebook content to form a deployable course container. The source content for this course can be found at https://github.com/PSDI-UK/aiida-gromacs 
Type Of Technology Software 
Year Produced 2025 
Open Source License? Yes  
Impact This training container forms part of a fully automated, self-updating, self-healing and auto-scaling Kubernetes-based training infrastructure. This means the CCPBioSim training materials are always bleeding edge and tested using CI-based methods before auto-deployment using CD. The infrastructure is version controlled using a GitOps approach, which means that we get all the benefits of automation without the huge time penalty in maintenance. 
URL https://github.com/jimboid/biosim-aiida-gpcr-workshop
 
Title PSDI AiiDA Lysozyme Workshop Container 
Description This container is derived from the CCPBioSim JupyterHub image. This container adds the necessary software packages and notebook content to form a deployable course container. The source content for this course can be found at https://github.com/PSDI-UK/aiida-gromacs 
Type Of Technology Software 
Year Produced 2025 
Open Source License? Yes  
Impact This training container forms part of a fully automated, self-updating, self-healing and auto-scaling Kubernetes-based training infrastructure. This means the CCPBioSim training materials are always bleeding edge and tested using CI-based methods before auto-deployment using CD. The infrastructure is version controlled using a GitOps approach, which means that we get all the benefits of automation without the huge time penalty in maintenance. 
URL https://github.com/jimboid/biosim-aiida-lysozyme-workshop
 
Title aiida-amber v0.1.0 
Description The AMBER plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. 
Type Of Technology Software 
Year Produced 2025 
Open Source License? Yes  
Impact This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. 
URL https://github.com/PSDI-UK/aiida-amber
 
Title aiida-amber v1.0.0 
Description The AMBER plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. 
Type Of Technology Software 
Year Produced 2025 
Open Source License? Yes  
Impact This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. 
URL https://github.com/PSDI-UK/aiida-amber
 
Title aiida-amber v2.0.1 
Description The AMBER plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. 
Type Of Technology Software 
Year Produced 2025 
Open Source License? Yes  
Impact This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. 
URL https://github.com/PSDI-UK/aiida-amber
 
Title aiida-gromacs v2.0.1 
Description The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. 
Type Of Technology Software 
Year Produced 2023 
Open Source License? Yes  
Impact This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. 
URL https://github.com/PSDI-UK/aiida-gromacs
 
Title aiida-gromacs v2.0.10 
Description The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. 
Type Of Technology Software 
Year Produced 2025 
Open Source License? Yes  
Impact This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. 
URL https://github.com/PSDI-UK/aiida-gromacs
 
Title aiida-gromacs v2.0.2 
Description The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. 
Type Of Technology Software 
Year Produced 2023 
Open Source License? Yes  
Impact This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. 
URL https://github.com/PSDI-UK/aiida-gromacs
 
Title aiida-gromacs v2.0.3 
Description The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. 
Type Of Technology Software 
Year Produced 2023 
Open Source License? Yes  
Impact This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. 
URL https://github.com/PSDI-UK/aiida-gromacs
 
Title aiida-gromacs v2.0.4 
Description The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. 
Type Of Technology Software 
Year Produced 2024 
Open Source License? Yes  
Impact This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. 
URL https://github.com/PSDI-UK/aiida-gromacs
 
Title aiida-gromacs v2.0.5 
Description The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. 
Type Of Technology Software 
Year Produced 2024 
Open Source License? Yes  
Impact This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. 
URL https://github.com/PSDI-UK/aiida-gromacs
 
Title aiida-gromacs v2.0.6 
Description The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. 
Type Of Technology Software 
Year Produced 2024 
Open Source License? Yes  
Impact This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. 
URL https://github.com/PSDI-UK/aiida-gromacs
 
Title aiida-gromacs v2.0.7 
Description The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. 
Type Of Technology Software 
Year Produced 2024 
Open Source License? Yes  
Impact This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. 
URL https://github.com/PSDI-UK/aiida-gromacs
 
Title aiida-gromacs v2.0.8 
Description The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. 
Type Of Technology Software 
Year Produced 2024 
Open Source License? Yes  
Impact This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. 
URL https://github.com/PSDI-UK/aiida-gromacs
 
Title aiida-gromacs v2.0.9 
Description The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. 
Type Of Technology Software 
Year Produced 2025 
Open Source License? Yes  
Impact This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. 
URL https://github.com/PSDI-UK/aiida-gromacs
 
Title janus-core 
Description Tools for machine learnt interatomic potentials 
Type Of Technology Software 
Year Produced 2025 
Open Source License? Yes  
Impact A tool that supports multiple machine learnt interatomic potentials. 
URL https://zenodo.org/doi/10.5281/zenodo.14962154
 
Title stfc/aiida-mlip: v0.2.1 
Description machine learning interatomic potentials aiida plugin 
Type Of Technology Software 
Year Produced 2024 
Open Source License? Yes  
Impact aiida mlip plugin for janus-core 
URL https://zenodo.org/doi/10.5281/zenodo.11545400
 
Description 9th Annual CCPBioSim Conference talk on BioSimDB 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Jas Kalayan gave a talk on the current progress of developments of the data tools and infrastructure for biomolecular simulations taking place in the PSDI. The talk was followed by an expert panel to discuss current practice in data and HPC.
Year(s) Of Engagement Activity 2023
URL https://www.ccpbiosim.ac.uk/events/upcoming-events/eventdetail/95/-/9th-annual-ccpbiosim-conference-...
 
Description AI4SD, PSDS & PSDI Skills4Scientists Series 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Undergraduate students
Results and Impact This series was organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI4SD), the Physical Sciences Data-Science Service (PSDS), and the Physical Sciences Data Infrastructure (PSDI). This series was initially run over summer 2021 and aimed to educate scientists and improve their skills in a range of areas including research data management, Python, version control, ethics, and career development. The first iteration of this series was primarily aimed at final year undergraduates / early stage PhD students.

This series has now been run again in 2022 and 2023 and is in further development for 2024 to create a flipped/blended learning course, and to make a wide range of materials available online alongside the initial video content.
Year(s) Of Engagement Activity 2021,2022,2023
URL https://eprints.soton.ac.uk/453198/
 
Description CCP-NC Advanced Materials Search Tool Workshop 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The intended workshop outputs are below:
1. Capture Collaborative Computational Project for Nuclear Magnetic Resonance (NMR) Crystallography's (CCP-NC) use cases and establish metadata requirements to facilitate advanced searching capabilities such as sub-structure, super-structure, and similarity searching in the CCP-NC Magres database.
2. Identify best practices for employing standard chemical notations to ensure seamless interoperability.
3. Evaluate pre-built user assistance tools, such as Chemdoodle, to enhance the user interface of the CCP-NC advanced search tool.
4. Develop a roadmap for extending NOMAD's OPTIMADE API to expose CCP-NC metadata fields via a fully functional API endpoint.
5. Initial planning for integrating CCP-NC's Magres database and Cambridge Crystallographic Data Centre's (CCDC) Cambridge Structural Database (CSD) through PSDI.

The 'Advanced Materials Search Tool for CCP-NC - Planning Workshop' brought together experts from across the computational chemistry and materials science communities - Collaborative Computational Project for NMR Crystallography (CCP-NC), Cambridge Crystallographic Data Centre (CCDC), and Physical Sciences Data Infrastructure (PSDI) - to discuss best practices and future developments in cheminformatics for solid-state NMR crystallography. It was a highly productive event, with the discussions laying the groundwork for improving metadata standards, enhancing search functionalities, and integrating advanced molecular representations into the existing CCP-NC infrastructure. The outcomes of this workshop will directly shape the next phase of development, guiding the creation of a scientifically rigorous and user-focused search tool and Magres database (version 2), while also strengthening collaboration between CCP-NC, CCDC, and PSDI.
Year(s) Of Engagement Activity 2025
 
Description Community Data workshops (Southampton, Edinburgh) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Approximately 30 research professionals (researchers and research support staff) attended two workshops to better understand the current challenges and opportunities around data sharing, and to gather requirements for how the PSDI platform could facilitate such data sharing and, more broadly, cross-discipline and cross-sector collaboration. A report was written on the workshops and their findings, which have also been incorporated into subsequent workshops.
Year(s) Of Engagement Activity 2024
 
Description Community consultation trusted data resources 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Research Consulting, in partnership with Jisc, is undertaking a community consultation exploring potential licensing options to enable continued access to trusted data sources for the UK physical sciences academic community beyond January 2026, and how this could be accomplished through the merging of PSDS and PSDI. The findings will inform recommendations and conclusions regarding options for future licensing models. Consultation has been undertaken with database providers, service providers, data source users and librarians.

This activity is still ongoing and will produce an options analysis and recommendations to be taken to the funding council.
Year(s) Of Engagement Activity 2024,2025
 
Description IDCC 2025 workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact 30 data stewards and attendees from related roles attended the PSDI workshop at the International Digital Curation Conference (IDCC) 2025. The workshop focused on creating communities around best practices and common challenges in data. It provided a brief background on the aims of the PSDI project, the communities we are currently engaging with, and those we hope to work with in the future. Group discussions explored the challenges, needs and potential solutions raised, and how PSDI might help by providing a platform for sharing tools and best practices across the community.

A report is currently in preparation.
Year(s) Of Engagement Activity 2025
 
Description MDDB and EBI working group on data collection for biomolecular simulation - Oxford 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This meeting was held by the EU-funded MDDB project along with the EBI to bring together key stakeholders in Europe concerned with data collection in the biomolecular simulation domain. Jas Kalayan attended the meeting and presented the PSDI BioSimDB technology and roadmap. MDDB and PSDI agreed to cooperate.
Year(s) Of Engagement Activity 2023
 
Description MDDB and MD software developers' workshop - Oxford 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact An MDDB-arranged workshop aimed at developers of software within the biomolecular simulation field. The focus was on the data outputs of tools and how to establish standards for metadata.
Year(s) Of Engagement Activity 2023
 
Description Machine Learning for Atomistic Modelling Autumn School 2023 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact This was a three-day event that took place in person at the Daresbury Laboratory. It was a machine learning for materials training course run by the Physical Sciences Data Infrastructure (PSDI) initiative in collaboration with PSDS, AI4SD, STFC-SCD and CCP5. The training was targeted towards PhD students, in particular those in the Materials and Molecular Simulations field. Its aim was to introduce attendees to the latest methods of machine learning applied to atomistic simulation of materials.
This training encompassed a number of talks and practical sessions, focusing on the basics of machine learning, machine learning interatomic potentials and graph neural networks. There was also an opportunity for attendees to present a poster on their work. Overall the school was very well received with requests to run it as a yearly event.
Year(s) Of Engagement Activity 2023
URL https://www.psdi.ac.uk/event/machine-learning-autumn-school-2023/
 
Description PSDI Townhall Meeting 2024 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact 60 representatives from across the national physical sciences research community attended the PSDI townhall meeting. The first PSDI Townhall was organized to foster engagement and collaboration among stakeholders in the physical sciences data infrastructure community, and provide a mechanism for researchers, data scientists, and policymakers to exchange ideas and contribute to the evolution of an effective data infrastructure for the Physical Sciences. Attendance at the townhall was open to all interested parties, and targeted invitations were directed to representatives across large-scale facilities, related infrastructure projects, computational initiatives and EPSRC-funded projects to ensure diverse representation.

The townhall event showcased some of the development activities in PSDI, gathered feedback from the community and provided information on how to get more involved with PSDI activities. A report on the event was produced, alongside recordings of the demonstrators.
Year(s) Of Engagement Activity 2024
 
Description Panel Session at RSECon2022 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Hosted an expert panel and discussion session at the national research software engineering conference RSECon. The discussion covered the data needs of the long-tail community.
Year(s) Of Engagement Activity 2022
 
Description Presentation at AI4SD Annual Conference "PSDI - Shaping the Physical Sciences Roadmap" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact Invited to present on the PSDI project and the landscape for physical sciences data. Interactive engagement with the hybrid audience of over 100 to discuss their data needs and requirements. This was also published as a video on the organisers' YouTube channel and has over 300 views.
Year(s) Of Engagement Activity 2022
URL https://www.youtube.com/watch?v=4Ukn7TawAhs&list=PLyeHH3bEQqIYYcv2ZmgJ50wCaOreX8Dvn&index=23
 
Description Requirements analysis with National Research Facilities 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Representatives from National Research Facilities in the UK joined a workshop with PSDI to discuss their data needs and requirements. Several sessions were run with active discussion among participants. Follow-up discussions have been held about further activities to explore with PSDI.
Year(s) Of Engagement Activity 2022
 
Description Research Data in the physical sciences 2025 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The Research Data in the Physical Sciences workshop was an in-person forum designed for data librarians and research support professionals working within the physical sciences across the UK and EU. Approximately 35 people working in this area attended the two-day workshop. Through keynote presentations, poster sessions, lightning talks, demonstrations, and interactive discussions, the workshop provided opportunities for knowledge exchange, community discussion, and networking, all centred on the challenges, opportunities, and emerging solutions in research data management for the physical sciences.

This workshop provided considerable insight into the different roles within many universities and projects, and has sparked multiple discussions about future events, collaborations and avenues of work, both for PSDI and other projects.
Year(s) Of Engagement Activity 2025
 
Description Skills4Scientists 2024 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Undergraduate students
Results and Impact Skills4Scientists training programme delivered by PSDI and PSDS to the 2024 internship cohort run by PSDI at the University of Southampton. This included technical and research skills sessions. This boosted the confidence of the students and informed them about further research work. Several interns went on to carry out further research work for research groups and spin-out companies within the university.
Year(s) Of Engagement Activity 2024
 
Description Skills4Scientists training series - 2023 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Undergraduate students
Results and Impact Skills4Scientists training programme delivered to the 2023 internship cohort run by PSDI at the University of Southampton. This included technical and research skills sessions. This boosted the confidence of the students and informed them about further research work. One student in particular went on to start a PhD as a result of this internship work.
Year(s) Of Engagement Activity 2023
 
Description Webinar Series 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A webinar series was set up to communicate the PSDI work to the community and to get their input. So far 13 webinars have been run with over 350 attendees. These webinars have covered the breadth of our pathfinder activities. They have also been recorded and have over 800 additional views on YouTube.
Year(s) Of Engagement Activity 2023,2024,2025
URL http://www.psdi.ac.uk/events