PSDI Phase 1b
Lead Research Organisation:
Science and Technology Facilities Council
Department Name: Scientific Computing Department
Abstract
PSDI is a key enabler of Digital Chemistry and Materials Discovery, providing a platform to underpin the role of digital technologies and AI in enabling discovery across the Physical Sciences, and linking to data infrastructures in other domains. Through PSDI, researchers will be able to leverage the combination of the transformative potential of digital technologies with molecular and materials science principles to support their everyday working practice whilst at the same time accelerating discovery and innovation. PSDI will help drive the missions to achieve a Net Zero Chemicals Sector by 2041; reimagine materials discovery to accelerate technologies for Net Zero; and optimise drug discovery beyond small molecules.
Today, each physical science research infrastructure, from individual laboratories to large facilities, has essentially its own isolated data ecosystem, which is often bespoke and managed to varying degrees. In contrast, many other domains have data-centric infrastructures for collecting and reusing data which act as community hubs and drivers of new methods and discoveries. There is a clear need within physical sciences for an additional infrastructure layer to enable researchers to acquire, analyse and share their data in addition to searching, using and aggregating a wide range of existing resources whilst ensuring that each dataset can remain dedicated to its specific application.
There is a need to preserve and exploit outputs from past research while keeping pace with the increasing rate of data generation, the latter posing the greatest challenge and potential for innovation. New chemicals, materials and devices are key to a sustainable future, both environmentally and financially. The UK needs to invent its way out of seemingly conflicting targets of maintaining economic growth whilst making unprecedented strides towards an imminent net zero carbon output and PSDI will be the enabling vehicle for this approach.
This phase of PSDI builds on the results of the PSDI pilot project which ran from November 2021 to March 2022 (www.psdi.ac.uk). In this second phase of PSDI, we will begin to implement the recommendations that were developed during the PSDI pilot phase. We will commence development of the PSDI "Hub" and a number of "Pathfinders" which will seed the population of the Hub. This Phase will also continue community engagement activities and initialise a community governance mechanism for PSDI, as well as exploring and evaluating possible future pathfinders.
Publications
Batatia I
(2024)
A foundation model for atomistic materials chemistry
Bicarregui J
(2023)
Connecting Infrastructures: The Physical Sciences Data Infrastructure (PSDI) in the UK
in Proceedings of the Conference on Research Data Infrastructure
Chao K
(2024)
Human class B1 GPCR modulation by plasma membrane lipids
| Title | Data Revival - Making the intangible tangible: The journey from lab notebook to digital insight - webinar |
| Description | In the Physical Sciences Data Infrastructure (PSDI) our first round pathfinders are exploratory pieces of work looking at an application area where PSDI could develop tools to enhance the research infrastructure. This video presents a recording of the Data Revival webinar presented by Samuel Munday on 16th November 2023. https://www.psdi.ac.uk/event/webinar-data-revival/ We discuss the process of digitising an archive of laboratory notebooks effectively, the AI tools we have created to work with such unstructured knowledge at scale, the utility of the digital database created for the chemistry department, and the feedback received from the department on the system's potential for further development. |
| Type Of Art | Film/Video/Animation |
| Year Produced | 2024 |
| Impact | Further enquiries about the Pathfinder activities in the PSDI initiative. |
| URL | https://youtu.be/yq-lhlYbJ4U |
| Title | Introduction to PSDI: Webinar |
| Description | The Physical Sciences Data Infrastructure (PSDI) is an initiative funded by EPSRC which aims to accelerate research in the physical sciences by providing a data infrastructure that brings together and builds upon the various data systems researchers currently use. This video presents a recording of the Introduction to PSDI webinar which was run on 29th June 2023. https://www.psdi.ac.uk/event/webinar-introduction-to-psdi/ |
| Type Of Art | Film/Video/Animation |
| Year Produced | 2023 |
| Impact | Further enquiries in the PSDI initiative |
| URL | https://youtu.be/iOg8YSE-A7I |
| Title | PSDI Pathfinders: Data Capture in Catalysis - webinar |
| Description | In the Physical Sciences Data Infrastructure (PSDI) our first round pathfinders are exploratory pieces of work looking at an application area where PSDI could develop tools to enhance the research infrastructure. This video presents a recording of the Pathfinder 1: Experimental Data Capture in Catalysis webinar presented by Abraham Nieva de la Hidalga which was run on 3rd October 2023. https://www.psdi.ac.uk/event/webinar-psdi-pf1/ We demonstrate two techniques for processing and analysing data that generate the required metadata to create FAIR digital objects. |
| Type Of Art | Film/Video/Animation |
| Year Produced | 2023 |
| Impact | Further enquiries about the Pathfinder 1 activities in the PSDI initiative. In particular expansion of the work to different analytical techniques. |
| URL | https://youtu.be/hKMhO1_xUtE |
| Title | PSDI Pathfinders: FAIR Data for the Biomolecular Simulation Community - webinar |
| Description | In the Physical Sciences Data Infrastructure (PSDI) our first round pathfinders are exploratory pieces of work looking at an application area where PSDI could develop tools to enhance the research infrastructure. This video presents a recording of the Pathfinder 4: FAIR Data for the Biomolecular Simulation Community webinar presented by James Gebbie-Rayet and Jas Kalayan which was run on 18th October 2023. https://www.psdi.ac.uk/event/webinar-psdi-pf4/ We present possible solutions to address these two problems; firstly, a software tool to record data provenance towards FAIR compliant formats, and the other an online data repository to store and share this data. |
| Type Of Art | Film/Video/Animation |
| Year Produced | 2024 |
| Impact | Further enquiries about the Pathfinder 4 activities in the PSDI initiative, reaching out to the wider biomolecular simulation community at different institutions. |
| URL | https://youtu.be/FA_rVv-hZig |
| Title | PSDI Pathfinders: Process Recording - Webinar |
| Description | In the Physical Sciences Data Infrastructure (PSDI) our first round pathfinders are exploratory pieces of work looking at an application area where PSDI could develop tools to enhance the research infrastructure. This video presents a recording of the Pathfinder 2: Process Recording webinar presented by Dr. Samantha Kanza which was run on 27th July 2023. https://www.psdi.ac.uk/event/webinar-psdi-pf2/ The webinar discusses the shift in software offerings and attitudes to process recording software, and reports on the results of a recent survey on ELN and notebook usage in our physical sciences community. |
| Type Of Art | Film/Video/Animation |
| Year Produced | 2023 |
| Impact | Further enquiries about the Pathfinder 2 activities in the PSDI initiative |
| URL | https://youtu.be/r2Hre41xJSk |
| Description | Physical Sciences SAT - Jeremy |
| Geographic Reach | National |
| Policy Influence Type | Participation in a guidance/advisory committee |
| Description | Physical Sciences Data Infrastructure Phase 1b - Southampton Extension |
| Amount | £2,146,968 (GBP) |
| Funding ID | EP/X032701/1 |
| Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
| Sector | Public |
| Country | United Kingdom |
| Start | 09/2023 |
| End | 12/2024 |
| Title | Coarse-grained MD simulation provenance of membrane embedded GPCR using GROMACS and aiida-gromacs |
| Description | Example setup of a martini2 coarse-grained molecular dynamics simulation of the active state PTH2R (Parathyroid hormone receptor type 2) protein embedded in a lipid bilayer membrane along with water and counter-ions. Command-line tools provided in aiida-gromacs are used to track each step performed on the terminal. The data: The files in this Zenodo record are: 1144.dot.pdf: The image of the graph representation for the simulation workflow. gpcr_all_steps.aiida: The AiiDA archive file where all simulation setup steps performed in this work are packaged. run_commands.sh: The script used to produce the provenance data for this system. inputs_only.zip: Simulation setup input files for the 'run_commands.sh' script. The output files produced by the aiida-gromacs provenance tool used to collect the simulation steps are in the following files: 1_protein.zip: Output files for the retrieval and cleaning up of the protein structure. 2_martinize.zip: Output files for the protein coarse graining steps. 3_insane.zip: Output files for building the lipid bilayer membrane around the coarse-grained protein. 4_gromacs.zip: Output files for the ionisation steps in GROMACS. 5_gromacs.zip: Output files for minimisation, equilibration and production simulation of the simulated system. The stripped trajectories for the active and inactive states of PTH2R are in the following files: R1a.pdb and R1a.xtc: pdb and trajectory of coordinates for the active state protein. R1i.pdb and R1i.xtc: pdb and trajectory of coordinates for the inactive state protein. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| URL | https://zenodo.org/doi/10.5281/zenodo.14359055 |
| Title | Coarse-grained MD simulation provenance of membrane embedded GPCR using GROMACS and aiida-gromacs |
| Description | Example setup of a martini2 coarse-grained molecular dynamics simulation of the active state PTH2R (Parathyroid hormone receptor type 2) protein embedded in a lipid bilayer membrane along with water and counter-ions. Command-line tools provided in aiida-gromacs are used to track each step performed on the terminal. The data: The files in this Zenodo record are: 1144.dot.pdf: The image of the graph representation for the simulation workflow. gpcr_all_steps.aiida: The AiiDA archive file where all simulation setup steps performed in this work are packaged. run_commands.sh: The script used to produce the provenance data for this system. inputs_only.zip: Simulation setup input files for the 'run_commands.sh' script. The output files produced by the aiida-gromacs provenance tool used to collect the simulation steps are in the following files: 1_protein.zip: Output files for the retrieval and cleaning up of the protein structure. 2_martinize.zip: Output files for the protein coarse graining steps. 3_insane.zip: Output files for building the lipid bilayer membrane around the coarse-grained protein. 4_gromacs.zip: Output files for the ionisation steps in GROMACS. 5_gromacs.zip: Output files for minimisation, equilibration and production simulation of the simulated system. The stripped trajectories for the active and inactive states of PTH2R are in the following files: R1a.pdb and R1a.xtc: pdb and trajectory of coordinates for the active state protein. R1i.pdb and R1i.xtc: pdb and trajectory of coordinates for the inactive state protein. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| URL | https://zenodo.org/doi/10.5281/zenodo.14359056 |
| Title | Data accessibility in the chemical sciences: an analysis of recent practice in organic chemistry journals |
| Description | This dataset is an analysis of the data outputs of 240 randomly selected research papers from 12 top-ranked journals published in early 2023. We investigate author compliance with recommended (but not compulsory) data policies, whether there is evidence to suggest that authors apply FAIR data guidance in their data publishing, and whether the existence of specific recommendations for publishing NMR data by some journals encourages compliance. Files in the data package have been provided in both human and machine-readable forms. The main dataset is available in the Excel file Data worksheet.XLSX, the contents of which can also be found in Main_dataset.CSV, Data_types.CSV, and Article_selection.CSV with explanations of the variable coding used in the studies in Variable_names.CSV, Codes.CSV, and FAIR_variable_coding.CSV. The R code used for the article selection can be found in Article_selection.R. Data about article types from the journals that contain original research data is in Article_types.CSV. Data collected for analysis in our sister paper[4] can be found in Extended_Adherence.CSV, Extended_Crystallography.CSV, Extended_DAS.CSV, Extended_File_Types.CSV, and Extended_Submission_Process.CSV. A full list of files in the data package and a short description for each is given in README.TXT. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | Not yet aware of impacts |
| URL | https://zenodo.org/doi/10.5281/zenodo.13928084 |
| Title | SlimMD |
| Description | This database of molecular dynamics trajectories has been designed to provide a light-weight set of trajectories for use in training and teaching materials. All MD trajectories in this database will have a simplified description and a maximum file size of 1GB, and will have been tested and known to work with a vanilla install of VMD. There is a wide range of biological systems with interesting features observed in real simulations conducted by various community members. Currently this dataset resides on the CCPBioSim website, but it has been included in PSDI's BioSimDB service, due to launch in March 2025. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2023 |
| Provided To Others? | Yes |
| Impact | This database has had many uses in educational settings, based on feedback given by universities that are making use of it, with approximately 15,000 unique, non-bot downloads across all systems in the database. |
| URL | https://ccpbiosim.ac.uk/slim-md |
| Description | AI4Green (University of Nottingham) |
| Organisation | University of Nottingham |
| Country | United Kingdom |
| Sector | Academic/University |
| PI Contribution | We are helping the AI4Green team with a study of the implementation of the AI4Green ELN in the undergraduate and postgraduate labs at the University of Nottingham. |
| Collaborator Contribution | The AI4Green team are letting us use them as a case study for our research in exchange for help with their ELN implementation. They are also providing us with intel on their OneNote lab books. |
| Impact | - Magazine article in Lab Horizons detailing the initial visit. - Led to a new collaboration with Splashlake to explore the use of OneNote as an ELN. Further outputs will include publications on the outcomes of this case study. |
| Start Year | 2023 |
| Description | Catalysis Hub |
| Organisation | Research Complex at Harwell |
| Department | UK Catalysis Hub |
| Country | United Kingdom |
| Sector | Public |
| PI Contribution | PSDI has provided expertise with metadata and software development, in particular with the Galaxy workflow platform. |
| Collaborator Contribution | CatalysisHub has provided technique specific knowledge, domain specific knowledge and access to the community. |
| Impact | - Contribution to the PSDI webinar series. Engagement event - Experimental Data Capture: producing publish ready data from processing and analysis processes, example with XAS data processing. |
| Start Year | 2022 |
| Description | DCC |
| Organisation | University of Edinburgh |
| Department | Digital Curation Centre (DCC) |
| Country | United Kingdom |
| Sector | Academic/University |
| PI Contribution | Organisation of community events, expert domain knowledge within multiple areas of the physical sciences. |
| Collaborator Contribution | Expertise in digital information curation with a focus on building capacity, capability, and skills for research data management and data sharing. They helped elicit input from the community, synthesizing community responses to generate recommendations as to how PSDI could address these challenges and concerns. |
| Impact | Community data workshops, including reports. |
| Start Year | 2023 |
| Title | Aiida-gromacs plugin |
| Description | The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. The design pattern we are aiming for is to allow researchers to capture the full data provenance for their simulations simply by activating an AiiDA conda environment and modifying their command lines very slightly. |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | The implementation of this plugin means researchers gain access to powerful FAIR data practices without wholesale cultural or usage pattern shifts in their daily work. |
| URL | https://aiida-gromacs.readthedocs.io/en/latest/ |
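The general idea of capturing provenance by lightly wrapping each command can be illustrated with a minimal sketch. This is not the aiida-gromacs or AiiDA API: the helper names `run_with_provenance` and `demo`, the JSON-lines log format, and the stand-in command are all assumptions made purely for illustration. The sketch records the command line, a timestamp, and content hashes of the declared input and output files for each step.

```python
import hashlib
import json
import subprocess
import sys
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def run_with_provenance(command, inputs, outputs, log_path):
    """Run a command and append one provenance record to a JSON-lines log.

    Hypothetical helper (not part of AiiDA): it stores the command line,
    a UTC start time, and hashes of the declared input and output files.
    """
    record = {
        "command": [str(part) for part in command],
        "started": datetime.now(timezone.utc).isoformat(),
        "inputs": {str(p): sha256_of(p) for p in inputs},
    }
    # check=True raises CalledProcessError if the step fails,
    # so only successful steps are written to the log.
    result = subprocess.run(command, check=True)
    record["returncode"] = result.returncode
    record["outputs"] = {str(p): sha256_of(p) for p in outputs}
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record


def demo():
    """Run one stand-in processing step and return (record, logged records)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "input.txt"
        dst = Path(tmp) / "output.txt"
        log = Path(tmp) / "provenance.jsonl"
        src.write_text("protein structure placeholder\n", encoding="utf-8")
        # Stand-in for a real simulation step: upper-case the input file.
        step = [
            sys.executable,
            "-c",
            "import sys, pathlib; "
            "pathlib.Path(sys.argv[2]).write_text("
            "pathlib.Path(sys.argv[1]).read_text().upper())",
            str(src),
            str(dst),
        ]
        record = run_with_provenance(step, [src], [dst], log)
        lines = log.read_text(encoding="utf-8").splitlines()
        return record, [json.loads(line) for line in lines]


if __name__ == "__main__":
    record, _ = demo()
    print(json.dumps(record, indent=2))
```

In a sketch like this, a chain of steps can be reconstructed afterwards by matching the output hashes of one record against the input hashes of later records. A full provenance system stores considerably more than this.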
| Title | PSDI AiiDA GPCR Workshop Container |
| Description | This container is derived from the CCPBioSim JupyterHub image. This container adds the necessary software packages and notebook content to form a deployable course container. The source content for this course can be found at https://github.com/PSDI-UK/aiida-gromacs |
| Type Of Technology | Software |
| Year Produced | 2025 |
| Open Source License? | Yes |
| Impact | This training container forms part of a fully automated, self-updating, self-healing and auto-scaling Kubernetes-based training infrastructure. This means the CCPBioSim training materials are always up to date and tested using CI-based methods before automatic deployment using CD. The infrastructure is version controlled using a GitOps approach, which means that we get all the benefits of automation without the huge time penalty in maintenance. |
| URL | https://github.com/jimboid/biosim-aiida-gpcr-workshop |
| Title | PSDI AiiDA Lysozyme Workshop Container |
| Description | This container is derived from the CCPBioSim JupyterHub image. This container adds the necessary software packages and notebook content to form a deployable course container. The source content for this course can be found at https://github.com/PSDI-UK/aiida-gromacs |
| Type Of Technology | Software |
| Year Produced | 2025 |
| Open Source License? | Yes |
| Impact | This training container forms part of a fully automated, self-updating, self-healing and auto-scaling Kubernetes-based training infrastructure. This means the CCPBioSim training materials are always up to date and tested using CI-based methods before automatic deployment using CD. The infrastructure is version controlled using a GitOps approach, which means that we get all the benefits of automation without the huge time penalty in maintenance. |
| URL | https://github.com/jimboid/biosim-aiida-lysozyme-workshop |
| Title | aiida-amber v0.1.0 |
| Description | The AMBER plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. |
| Type Of Technology | Software |
| Year Produced | 2025 |
| Open Source License? | Yes |
| Impact | This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. |
| URL | https://github.com/PSDI-UK/aiida-amber |
| Title | aiida-amber v1.0.0 |
| Description | The AMBER plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. |
| Type Of Technology | Software |
| Year Produced | 2025 |
| Open Source License? | Yes |
| Impact | This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. |
| URL | https://github.com/PSDI-UK/aiida-amber |
| Title | aiida-amber v2.0.1 |
| Description | The AMBER plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. |
| Type Of Technology | Software |
| Year Produced | 2025 |
| Open Source License? | Yes |
| Impact | This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. |
| URL | https://github.com/PSDI-UK/aiida-amber |
| Title | aiida-gromacs v2.0.1 |
| Description | The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. |
| Type Of Technology | Software |
| Year Produced | 2023 |
| Open Source License? | Yes |
| Impact | This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. |
| URL | https://github.com/PSDI-UK/aiida-gromacs |
| Title | aiida-gromacs v2.0.10 |
| Description | The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. |
| Type Of Technology | Software |
| Year Produced | 2025 |
| Open Source License? | Yes |
| Impact | This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. |
| URL | https://github.com/PSDI-UK/aiida-gromacs |
| Title | aiida-gromacs v2.0.2 |
| Description | The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. |
| Type Of Technology | Software |
| Year Produced | 2023 |
| Open Source License? | Yes |
| Impact | This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. |
| URL | https://github.com/PSDI-UK/aiida-gromacs |
| Title | aiida-gromacs v2.0.3 |
| Description | The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. |
| Type Of Technology | Software |
| Year Produced | 2023 |
| Open Source License? | Yes |
| Impact | This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. |
| URL | https://github.com/PSDI-UK/aiida-gromacs |
| Title | aiida-gromacs v2.0.4 |
| Description | The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. |
| URL | https://github.com/PSDI-UK/aiida-gromacs |
| Title | aiida-gromacs v2.0.5 |
| Description | The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. |
| URL | https://github.com/PSDI-UK/aiida-gromacs |
| Title | aiida-gromacs v2.0.6 |
| Description | The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. |
| URL | https://github.com/PSDI-UK/aiida-gromacs |
| Title | aiida-gromacs v2.0.7 |
| Description | The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. |
| URL | https://github.com/PSDI-UK/aiida-gromacs |
| Title | aiida-gromacs v2.0.8 |
| Description | The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. |
| URL | https://github.com/PSDI-UK/aiida-gromacs |
| Title | aiida-gromacs v2.0.9 |
| Description | The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations. This plugin is being developed as part of the Physical Sciences Data Infrastructure programme to improve the practices around data within the Physical Sciences remit area within the UK. |
| Type Of Technology | Software |
| Year Produced | 2025 |
| Open Source License? | Yes |
| Impact | This plugin will enable researchers to create shareable and reproducible molecular dynamics simulations that can be submitted to data storage facilities and enable fully reproducible workflows. |
| URL | https://github.com/PSDI-UK/aiida-gromacs |
| Title | janus-core |
| Description | Tools for machine learnt interatomic potentials |
| Type Of Technology | Software |
| Year Produced | 2025 |
| Open Source License? | Yes |
| Impact | A tool that supports multiple machine learnt interatomic potentials. |
| URL | https://zenodo.org/doi/10.5281/zenodo.14962154 |
| Title | stfc/aiida-mlip: v0.2.1 |
| Description | machine learning interatomic potentials aiida plugin |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | aiida mlip plugin for janus-core |
| URL | https://zenodo.org/doi/10.5281/zenodo.11545400 |
| Description | 9th Annual CCPBioSim Conference talk on BioSimDB |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Jas Kalayan gave a talk on the current progress of developments of the data tools and infrastructure for biomolecular simulations taking place in the PSDI. The talk was followed by an expert panel to discuss current practice in data and HPC. |
| Year(s) Of Engagement Activity | 2023 |
| URL | https://www.ccpbiosim.ac.uk/events/upcoming-events/eventdetail/95/-/9th-annual-ccpbiosim-conference-... |
| Description | AI4SD, PSDS & PSDI Skills4Scientists Series |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Undergraduate students |
| Results and Impact | This series was organised as a joint venture between the Artificial Intelligence for Scientific Discovery Network+ (AI4SD), the Physical Sciences Data-Science Service (PSDS), and the Physical Sciences Data Infrastructure (PSDI). It was first run over summer 2021 and aimed to improve scientists' skills in a range of areas including research data management, Python, version control, ethics, and career development. The first iteration was primarily aimed at final-year undergraduates and early-stage PhD students. The series was run again in 2022 and 2023 and is in further development for 2024 to create a flipped/blended learning course and to make a wide range of materials available online alongside the initial video content. |
| Year(s) Of Engagement Activity | 2021,2022,2023 |
| URL | https://eprints.soton.ac.uk/453198/ |
| Description | CCP-NC Advanced Materials Search Tool Workshop |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | The intended workshop outputs are below: 1. Capture Collaborative Computational Project for Nuclear Magnetic Resonance (NMR) Crystallography's (CCP-NC) use cases and establish metadata requirements to facilitate advanced searching capabilities such as sub-structure, super-structure, and similarity searching in the CCP-NC Magres database. 2. Identify best practices for employing standard chemical notations to ensure seamless interoperability. 3. Evaluate pre-built user assistance tools, such as Chemdoodle, to enhance the user interface of the CCP-NC advanced search tool. 4. Develop a roadmap for extending NOMAD's OPTIMADE API to expose CCP-NC metadata fields via a fully functional API endpoint. 5. Initial planning for integrating CCP-NC's Magres database and Cambridge Crystallographic Data Centre's (CCDC) Cambridge Structural Database (CSD) through PSDI. The 'Advanced Materials Search Tool for CCP-NC - Planning Workshop' brought together experts from across the computational chemistry and materials science communities - Collaborative Computational Project for NMR Crystallography (CCP-NC), Cambridge Crystallographic Data Centre (CCDC), and Physical Sciences Data Infrastructure (PSDI) - to discuss best practices and future developments in cheminformatics for solid-state NMR crystallography. It was a highly productive event, with the discussions laying the groundwork for improving metadata standards, enhancing search functionalities, and integrating advanced molecular representations into the existing CCP-NC infrastructure. The outcomes of this workshop will directly shape the next phase of development, guiding the creation of a scientifically rigorous and user-focused search tool and Magres database (version 2), while also strengthening collaboration between CCP-NC, CCDC, and PSDI. |
| Year(s) Of Engagement Activity | 2025 |
| Description | Community Data workshops (Southampton, Edinburgh) |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | Approximately 30 research professionals (researchers and research support staff) attended two workshops to better understand the current challenges and opportunities around data sharing, and to gather requirements for the PSDI platform to facilitate such data sharing and cross-discipline, cross-sector collaboration more broadly. A report was written on the workshops and their findings, which have also been incorporated into subsequent workshops. |
| Year(s) Of Engagement Activity | 2024 |
| Description | Community consultation trusted data resources |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | Research Consulting, in partnership with Jisc, is undertaking a community consultation exploring potential licensing options to enable continued access to trusted data sources for the UK physical sciences academic community beyond January 2026, and how this could be accomplished through the merging of PSDS and PSDI. The findings will inform recommendations and conclusions regarding options for future licensing models. Consultation has been undertaken with database providers, service providers, data source users and librarians. This activity is ongoing and will produce an options analysis and recommendations to be taken to the funding council. |
| Year(s) Of Engagement Activity | 2024,2025 |
| Description | IDCC 2025 workshop |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | 30 data stewards and attendees from related roles attended the PSDI workshop at the International Digital Curation Conference 2025. The workshop focused on creating communities around best practices and common challenges in data. It provided a brief background on the aims of the PSDI project, the communities we are currently engaging with, and those we hope to work with in the future. Group discussions covered the challenges, needs and potential solutions raised, and how PSDI might help by providing a platform for sharing tools and best practices across the community. A report is currently in preparation. |
| Year(s) Of Engagement Activity | 2025 |
| Description | MDDB and EBI working group on data collection for biomolecular simulation - Oxford |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | This meeting was held by the EU-funded MDDB project along with the EBI to bring together key stakeholders in Europe concerned with data collection in the biomolecular simulation domain. Jas Kalayan attended the meeting and presented the PSDI BioSimDB technology and roadmap. MDDB and PSDI agreed to cooperate. |
| Year(s) Of Engagement Activity | 2023 |
| Description | MDDB and MD software developers' workshop - Oxford |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A workshop arranged by MDDB for developers of software in the biomolecular simulation field. It focused on the data outputs of tools and how to establish standards for metadata. |
| Year(s) Of Engagement Activity | 2023 |
| Description | Machine Learning for Atomistic Modelling Autumn School 2023 |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Postgraduate students |
| Results and Impact | This was a three-day event that took place in person at the Daresbury Laboratory. It was a machine learning for materials training course run by the Physical Sciences Data Infrastructure (PSDI) initiative in collaboration with PSDS, AI4SD, STFC-SCD and CCP5. The training was targeted at PhD students, in particular those in the Materials and Molecular Simulations field. Its aim was to introduce attendees to the latest methods of machine learning applied to atomistic simulation of materials. It comprised a number of talks and practical sessions focusing on the basics of machine learning, machine learning interatomic potentials and graph neural networks. There was also an opportunity for attendees to present a poster on their work. Overall the school was very well received, with requests to run it as a yearly event. |
| Year(s) Of Engagement Activity | 2023 |
| URL | https://www.psdi.ac.uk/event/machine-learning-autumn-school-2023/ |
| Description | PSDI Townhall Meeting 2024 |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | 60 representatives from across the national physical sciences research community attended the PSDI townhall meeting. The first PSDI Townhall was organized to foster engagement and collaboration among stakeholders in the physical sciences data infrastructure community, and provide a mechanism for researchers, data scientists, and policymakers to exchange ideas and contribute to the evolution of an effective data infrastructure for the Physical Sciences. Attendance at the townhall was open to all interested parties and targeted invitations were directed to representatives across large scale facilities, related infrastructure projects, computational initiatives and EPSRC funded projects to ensure diverse representation. The townhall event showcased some of the development activities in PSDI, gathering feedback from the community and providing information on how to get more involved with PSDI activities. A report was written from the event alongside recordings from the demonstrators. |
| Year(s) Of Engagement Activity | 2024 |
| Description | Panel Session at RSECon2022 |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | Hosted an expert panel and discussion session at the national research software engineering conference RSECon. The discussion covered the data needs of the long-tail community. |
| Year(s) Of Engagement Activity | 2022 |
| Description | Presentation at AI4SD Annual Conference "PSDI - Shaping the Physical Sciences Roadmap" |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Other audiences |
| Results and Impact | Invited to present on the PSDI project and the landscape for physical sciences data. Interactive engagement with the hybrid audience of over 100 to discuss their data needs and requirements. The presentation was also published as a video on the organisers' YouTube channel and has over 300 views. |
| Year(s) Of Engagement Activity | 2022 |
| URL | https://www.youtube.com/watch?v=4Ukn7TawAhs&list=PLyeHH3bEQqIYYcv2ZmgJ50wCaOreX8Dvn&index=23 |
| Description | Requirements analysis with National Research Facilities |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | Representatives from National Research Facilities in the UK joined a workshop with PSDI to discuss their data needs and requirements. Several sessions were run with active discussion among participants. Follow-up discussions have been held about further activities to explore with PSDI. |
| Year(s) Of Engagement Activity | 2022 |
| Description | Research Data in the physical sciences 2025 |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | The Research Data in the Physical Sciences workshop was an in-person forum designed for data librarians and research support professionals working within the physical sciences across the UK and EU. Approximately 35 people working in this area attended the two-day workshop. It provided opportunities for knowledge exchange, community discussion, and networking, all centred on the challenges, opportunities, and emerging solutions in research data management for the physical sciences, through keynote presentations, poster sessions, lightning talks, demonstrations, and interactive discussions. The workshop gave considerable insight into the different roles within many universities and projects and has sparked multiple discussions about future events, collaborations and avenues of work, both for PSDI and other projects. |
| Year(s) Of Engagement Activity | 2025 |
| Description | Skills4Scientists 2024 |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | Local |
| Primary Audience | Undergraduate students |
| Results and Impact | Skills4Scientists training programme delivered by PSDI and PSDS to the 2024 internship cohort run by PSDI at the University of Southampton. This included technical and research skills sessions. It boosted the confidence of the students and informed them about further research work. Several interns went on to carry out further research work for research groups and spin-out companies within the university. |
| Year(s) Of Engagement Activity | 2024 |
| Description | Skills4Scientists training series - 2023 |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | Local |
| Primary Audience | Undergraduate students |
| Results and Impact | Skills4Scientists training programme delivered to the 2023 internship cohort run by PSDI at the University of Southampton. This included technical and research skills sessions. This boosted the confidence of the students and informed them about further research work. One student in particular went on to start a PhD as a result of this internship work. |
| Year(s) Of Engagement Activity | 2023 |
| Description | Webinar Series |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A webinar series was set up to communicate the PSDI work to the community and to gather their input. So far 13 webinars have been run with over 350 attendees. These webinars have covered the breadth of our pathfinder activities. They have also been recorded and have over 800 additional views on YouTube. |
| Year(s) Of Engagement Activity | 2023,2024,2025 |
| URL | http://www.psdi.ac.uk/events |
