Embedding FAIRness in Plasma Science

Lead Research Organisation: University of York
Department Name: Physics

Abstract

Open science is perhaps best embodied by the FAIR principles for software and data: that they should be Findable, Accessible, Interoperable, and Reusable. When researchers make their code and data available, it becomes easier for others to verify results, and easier for them to build on that work to spur new research of their own. Alongside the FAIR principles is the idea of "sustainable" software: software that can continue to be used beyond its original intended purpose while remaining reliable and reproducible. Sustainable software is important for high quality research.

The goal of this Fellowship is to help researchers in plasma science overcome barriers to implementing these principles and ideas in their work, and bring about a cultural change to make sharing FAIR software and data the norm. I will do this by establishing a national network of research software engineers (RSEs) who will undertake efficient, wide-ranging improvements across the plasma science software ecosystem. The objective is not to make a single code massively better; it is to create and maintain an environment and philosophy that will benefit all plasma codes used in the UK -- "a rising tide lifts all boats".

In order to reach as much of the community as possible, this national network will focus on short usability and sustainability projects, along with training tailored to individual researchers and groups. This will be paired with code review, where an RSE will go through a piece of software with researchers and discuss its aims and implementation. Code review is commonplace in industry, but rarer in academia. Together, the use of code review and short projects will give the network a good idea of what software is needed and used by the community, targeting projects where they are most needed and encouraging reuse of software between groups.

As well as improving software directly, I will also work on the data front. To do this, I will develop tools to help overcome the friction and effort needed for researchers to adopt FAIR data practices. These tools will add metadata output to software, capturing important information like what version of what code created the output. This metadata can then be used to automate uploading the output to a database. I will work with the plasma science and data communities to develop what this metadata will look like, while the national network will implement these tools across the plasma science software ecosystem.
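
As a rough illustration of the kind of metadata output described above, the following minimal Python sketch records which code and version produced an output and writes that record to a JSON file alongside it. The function names, the field names, and the JSON "sidecar" layout are assumptions made for illustration only; the actual metadata schema is to be developed with the community.

```python
import json
import platform
from datetime import datetime, timezone


def build_provenance(code_name, code_version, inputs):
    """Collect metadata describing what produced an output (hypothetical schema)."""
    return {
        "software": {"name": code_name, "version": code_version},
        "created": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "inputs": dict(inputs),
    }


def write_sidecar(metadata, path):
    """Write the metadata to a JSON file stored next to the output it describes."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=2, sort_keys=True)


if __name__ == "__main__":
    # "my_plasma_code" and its inputs are placeholder values, not a real code.
    meta = build_provenance("my_plasma_code", "1.2.3", {"nx": 64, "timestep": 1e-3})
    print(json.dumps(meta, indent=2, sort_keys=True))
```

Because the record is plain JSON, an upload tool could read it without needing to understand the format of the output file itself.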

Publications

Parker J (2022) Parallel tridiagonal matrix inversion with a hybrid multigrid-Thomas algorithm method in Journal of Computational and Applied Mathematics

Jovanovic A (2023) Introduction and verification of FEDM, an open-source FEniCS-based discharge modelling code in Plasma Sources Science and Technology

 
Description EPSRC Evaluation Review of the UK National Tier-2 High Performance Computing Services Panel 1
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
 
Description EPSRC Evaluation Review of the UK National Tier-2 High Performance Computing Services Panel 2
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
 
Description Plasma Physics HEC Consortia
Amount £284,554 (GBP)
Funding ID EP/X035336/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 01/2023 
End 12/2026
 
Title Introduction and verification of FEDM, an open-source FEniCS-based discharge modelling code - dataset 
Description The dataset contains the data presented in the paper introducing the FEDM (Finite Element Discharge Modelling) code. The FEDM code was developed using the open-source computing platform FEniCS (https://fenicsproject.org). Building on FEniCS, the FEDM code utilises the finite element method to solve partial differential equations. It extends FEniCS with features that allow the automated implementation and numerical solution of fully-coupled fluid-Poisson models, including an arbitrary number of particle balance equations. The code is verified using the method of exact solutions and benchmarking. The physically based examples of a time-of-flight experiment, a positive streamer discharge in atmospheric-pressure air and a low-pressure glow discharge in argon are used as rigorous test cases for the developed modelling code and to illustrate its capabilities. The performance of the code is compared to the commercial software package COMSOL Multiphysics®, and a comparable parallel speed-up is obtained. It is shown that the iterative solver implemented by FEDM performs particularly well on high-performance compute clusters. 
Type Of Material Database/Collection of data 
Year Produced 2023 
Provided To Others? Yes  
URL https://www.inptdat.de/node/668
 
Title 2D Dynamic Time Warping (DTW) algorithm for python 
Description DTW is a dynamic time warping library designed for tokamak velocimetry measurements. The algorithm works by distorting image 2 into image 1, then using the distorted coordinates to compute the displacements required to distort image 1 into image 2, given at the positions in image 1. 
Type Of Technology Software 
Year Produced 2023 
Open Source License? Yes  
URL https://zenodo.org/record/7649021
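
For readers unfamiliar with dynamic time warping, the sketch below shows the classic one-dimensional algorithm in plain Python. It is not the 2D image algorithm implemented by the library above, and the function name `dtw_path` is hypothetical; it only illustrates the underlying idea of finding the lowest-cost alignment between two sequences, from which displacements between matched samples can be read off.

```python
def dtw_path(a, b):
    """Return (total_cost, path) for the classic 1D DTW alignment of a and b.

    path is a list of (i, j) index pairs matching a[i] to b[j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Walk back from the end, always stepping to the cheapest predecessor.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    signal = [0, 1, 2, 1, 0]
    shifted = [0, 0, 1, 2, 1, 0]  # same shape, delayed by one sample
    total, path = dtw_path(signal, shifted)
    # j - i is the displacement of each matched sample.
    print(total, [(i, j - i) for i, j in path])
```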
 
Title ACT MAST-U charge exchange diagnostics 
Description The Mega Ampere Spherical Tokamak Upgrade (MAST-U) project generates a lot of data. After each 'shot', the raw data from sensors within the tokamak is made available to researchers via the Universal Data Access (UDA) system, with each signal represented by a three-letter code. For example, 'RCC' is the signal for Celeste-3, which measures emission spectra from impurities in the plasma. MAST-U data is processed further using a scheduler system that automatically processes raw data into more useful diagnostics once the required signals become available. This processed data is itself made available via UDA under a three-letter signal code, and may in turn be used to generate higher-level diagnostics via the scheduler. ACT is the signal representing Charge Exchange Recombination Spectroscopy (CXRS) data; CXRS is a technique used to measure properties such as the temperature and density of impurity ions within a plasma. This diagnostic is generated by combining raw data from the Celeste-3 camera with data from the neutral-beam injection system. A newer signal, ACU, represents similar data gleaned from the upgraded Celeste-4 sensor package, which features two cameras as opposed to one, both of which operate on variable sight lines. 
Type Of Technology Software 
Year Produced 2023 
Impact More thorough testing of the package. Improvements to internal UKAEA cookiecutter template 
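
The scheduling idea described above, where a derived signal is produced once its input signals are available, can be sketched as a small dependency graph. This is only an illustration using Python's standard `graphlib` module (Python 3.9 or later), not the MAST-U scheduler itself. 'RCC' and 'ACT' are taken from the description, while `NBI_PLACEHOLDER` and `HIGHER_LEVEL_PLACEHOLDER` are invented stand-ins rather than real signal codes.

```python
from graphlib import TopologicalSorter

# Each derived signal maps to the set of signals it needs as inputs.
DEPENDENCIES = {
    "ACT": {"RCC", "NBI_PLACEHOLDER"},      # placeholder for the beam data signal
    "HIGHER_LEVEL_PLACEHOLDER": {"ACT"},    # placeholder higher-level diagnostic
}


def processing_order(dependencies):
    """Return every signal in an order where inputs come before their outputs."""
    return list(TopologicalSorter(dependencies).static_order())


def ready_to_run(dependencies, available):
    """Return derived signals not yet produced whose inputs are all available."""
    return sorted(
        signal
        for signal, needs in dependencies.items()
        if signal not in available and needs <= available
    )


if __name__ == "__main__":
    print(processing_order(DEPENDENCIES))
    print(ready_to_run(DEPENDENCIES, {"RCC", "NBI_PLACEHOLDER"}))
```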
 
Title BOUT++ v4.4.2 
Description BOUT++ is a framework for writing fluid and plasma simulations in curvilinear geometry. It is intended to be quite modular, with a variety of numerical methods and time-integration solvers available. BOUT++ is primarily designed and tested with reduced plasma fluid models in mind, but it can evolve any number of equations, with equations appearing in a readable form. 
Type Of Technology Software 
Year Produced 2022 
Impact This release fixed several important bugs 
URL https://zenodo.org/record/1423212
 
Title BOUT++ v4.4.2 
Description BOUT++ is a framework for writing fluid and plasma simulations in curvilinear geometry. It is intended to be quite modular, with a variety of numerical methods and time-integration solvers available. BOUT++ is primarily designed and tested with reduced plasma fluid models in mind, but it can evolve any number of equations, with equations appearing in a readable form. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact BOUT++ is an important plasma modelling tool for many groups worldwide, including University of York (UK), CCFE (UK), LLNL (USA), DCU (ROI), DTU (DK). 
URL https://zenodo.org/record/6325664
 
Title C-bowman/inference-tools: 0.6.2 release 
Description Fixed an import in inference.__init__ which was causing issues for python 3.6 and 3.7 in some cases. 
Type Of Technology Software 
Year Produced 2021 
Impact I modernised the build system, automating a lot of the packaging through CI and GitHub Actions, as well as expanding the test suite to improve coverage and specificity 
URL https://zenodo.org/record/5718795
 
Title EPOC++ 
Description EPOC++ is a rewrite of the world-leading EPOCH plasma PIC code in C++ in order to take advantage of performance-portable frameworks for exascale. Our work on this package consisted of setting up CI, testing, linting, and various other "DevOps" workflows. 
Type Of Technology Software 
Year Produced 2023 
Open Source License? Yes  
Impact A robust development workflow was implemented, helping to catch bugs before they were merged. 
URL https://warwick-plasma.github.io/EPOCpp/
 
Title EPOCH Containers 
Description Epoch (licence: GPLv3) is a particle-in-cell (PIC) code widely used within plasma physics, particularly in the regime of laser-plasma interactions. PIC codes aim to self-consistently solve Maxwell's equations in the presence of a large number of charged particles, many of which travel at relativistic velocities. Epoch and similar codes are often used to provide insight into the physics of matter interacting with extremely intense radiation, such as the conditions observed in inertial-confinement fusion (ICF) experiments or many astrophysical phenomena. Epoch is written in Fortran and is designed to be run on HPC systems. As is typical for such software, there is a steep learning curve for new users, particularly those who aren't experienced working from the Linux command line. During previous iterations of a summer school masterclass at the York Plasma Institute, the set-up process for Epoch was found to be a significant barrier-to-entry for some students, which limited the amount of work they could complete during their short research projects. The aim of this project was to see if containerising Epoch could help to get new users running simulations faster. Containers are similar to virtual machines, in that they enable the packaging and distribution of a complete computational workflow all the way down to the operating system level. The key difference is that virtual machines pass all OS-level commands through a hardware emulation layer, while containers are able to interface directly with a target machine's hardware while restricting all activity to a protected memory space. An oft-cited benefit of containers is that researchers can distribute complete workflows with all dependencies already included, which makes it much easier for others to replicate and build upon their results. The ability to share pre-built software and a full stack of supporting libraries all the way down to the operating system also makes it easier to share complex software. 
However, containerising HPC codes isn't a particularly straightforward process. Docker is the dominant containerisation platform by a huge margin, but the Docker engine - which is responsible for managing container images and memory volumes on a user's system - runs container processes with root privileges. As HPC system administrators take a particularly dim view of their users having unrestricted root access, Docker is generally unavailable on HPC systems. For this reason, HPC codes must be containerised using a tool such as Singularity, which is specifically designed to be compatible with HPC systems and tools such as MPI. We chose a two-stage approach to containerise Epoch. First, a Docker container was built containing multiple pre-compiled Epoch executables: one for each dimensionality (1D, 2D, 3D) using default compiler flags, and one more for each dimensionality with quantum electrodynamic effects switched on. The advantage of creating a Docker container first is that it can be easily tested locally, and there are many free tools available for automatically building and publishing Docker containers via the GitHub Container Registry (GHCR). This Docker container image was then used as a base to create a Singularity image, which was hosted using the Sylabs Container Services. 
Type Of Technology Software 
Year Produced 2023 
Open Source License? Yes  
Impact One round of summer school students have now had experience running Epoch via containers, and we have received excellent feedback 
 
Title Finite Element Discharge Modelling (FEDM) 
Description The Finite Element Discharge Modelling code (FEDM) is a collection of functions to assist in the simulation of electric discharges using finite element methods, created by Aleksandar Jovanovic at the Leibniz Institute for Plasma Science and Technology. Utilising the FEniCS finite element library and its Python API Dolfin, FEDM is able to simulate a large number of particle species and the reactions that may occur between them during an electrical discharge. These systems can involve a large number of source terms, some of which may be stiff in nature, and coding these by hand can be very laborious and error-prone. FEDM aims to simplify the development of these models. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
URL https://zenodo.org/record/3839712
 
Title Ford 
Description FORD, standing for FORtran Documenter, is an automatic documentation generator for modern Fortran programs. As you may know, "to ford" refers to crossing a river (or other body of water). It does not, in this context, refer to any company or individual associated with cars. FORD was written due to Doxygen's poor handling of Fortran and the lack of comparable alternatives. ROBODoc can't actually extract any information from the source code and just about any other automatic documentation software was either proprietary, didn't work very well for Fortran, or was limited in terms of how it produced its output. The goal of FORD is to be able to reliably produce documentation for modern Fortran software which is informative and nice to look at. The documentation should be easy to write and non-obtrusive within the code. While it will never be as feature-rich as Doxygen, hopefully FORD will be able to provide a good alternative for documenting Fortran projects. 
Type Of Technology Software 
Year Produced 2018 
Open Source License? Yes  
Impact FORD is an important tool in the Fortran community, and unfortunately, due to lack of time from the existing maintainers, was starting to languish. As part of my PlasmaFAIR project, I took over maintainership so that FORD could continue to meet the needs of its users. I was able to review, merge, and fix the backlog of pull requests and bugs, add a fairly comprehensive test suite, and modernise the build system and packaging, culminating in the first new release of FORD in over two years. 
URL https://zenodo.org/record/1422472
 
Title FreeQDSK 
Description FreeQDSK is a Python library for reading/writing EQDSK files. These file formats are widely used within tokamak plasma research, but each code that uses them tends to include its own reader/writer functionality, and the availability of reliable open-source documentation is lacking. We hope that FreeQDSK can provide an easier way for plasma scientists to use these files in their own work. 
Type Of Technology Software 
Year Produced 2023 
Open Source License? Yes  
Impact FreeQDSK is already being used in more than half a dozen different Python packages 
URL https://github.com/freegs-plasma/FreeQDSK
 
Title GS2 
Description GS2 is a physics application, developed to study low-frequency turbulence in magnetized plasma. It is typically used to assess the microstability of plasmas produced in the laboratory and to calculate key properties of the turbulence which results from instabilities. It is also used to simulate turbulence in plasmas which occur in nature, such as in astrophysical and magnetospheric systems. 
Type Of Technology Software 
Year Produced 2020 
Open Source License? Yes  
Impact GS2 is a vital tool for a large fraction of the simulations performed under this project, as well as work at labs across the world, including the USA (PPPL, University of Maryland), Japan (NIFS, SOKENDAI), and the UK (CCFE, University of Oxford, University of York). 
URL https://gyrokinetics.gitlab.io/gs2/
 
Title GS2 v8.1.2 
Description GS2 is a physics application, developed to study low-frequency turbulence in magnetized plasma. It is typically used to assess the microstability of plasmas produced in the laboratory and to calculate key properties of the turbulence which results from instabilities. It is also used to simulate turbulence in plasmas which occur in nature, such as in astrophysical and magnetospheric systems. 
Type Of Technology Software 
Year Produced 2022 
Impact This release fixed several important bugs 
URL https://zenodo.org/record/2551066
 
Title SCENE 
Description SCENE is a tokamak equilibrium solver which can generate equilibria in a variety of file formats 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact I modernised the build system, added some testing and CI, and made it open source 
 
Title Scotty Beam Tracing 
Description Beam tracing for tokamak DBS diagnostics 
Type Of Technology Software 
Year Produced 2023 
Open Source License? Yes  
Impact Scotty now has a comprehensive test suite, improved input and output, and is available to install via PyPI 
 
Title The Paramak: automated parametric geometry construction for fusion reactor designs 
Description What's Changed
Refactoring hollow unmeshable components with overlapping edges by @shimwell in https://github.com/fusion-energy/paramak/pull/154
Adding part properties for brep to h5m by @shimwell in https://github.com/fusion-energy/paramak/pull/155
Setup cfg by @LiamPattinson in https://github.com/fusion-energy/paramak/pull/156
Adding plasmafair to citation file by @shimwell in https://github.com/fusion-energy/paramak/pull/160
Added twine check line to python-publish by @LiamPattinson in https://github.com/fusion-energy/paramak/pull/163
Ignore init.py imports with flake8 by @LiamPattinson in https://github.com/fusion-energy/paramak/pull/162
avoiding builds tools: method get version number from CI by @shimwell in https://github.com/fusion-energy/paramak/pull/159
Trigger CI on certain paths rather than paths-ignore by @LiamPattinson in https://github.com/fusion-energy/paramak/pull/164
Pass tag as paramak version when building docker on release by @ZedThree in https://github.com/fusion-energy/paramak/pull/166
Docs fix by @LiamPattinson in https://github.com/fusion-energy/paramak/pull/168
Modernised setup progress by @shimwell in https://github.com/fusion-energy/paramak/pull/165
Added patch for Workplane.extrude to permit CQ 2.1/2.2 compatibility by @LiamPattinson in https://github.com/fusion-energy/paramak/pull/167
typos and unused imports by @shimwell in https://github.com/fusion-energy/paramak/pull/170
Revert "typos and unused imports" by @shimwell in https://github.com/fusion-energy/paramak/pull/172
Minor fixes by @shimwell in https://github.com/fusion-energy/paramak/pull/173
Merge pull request #173 from fusion-energy/minor_fixes by @shimwell in https://github.com/fusion-energy/paramak/pull/174
Testing cleanup by @LiamPattinson in https://github.com/fusion-energy/paramak/pull/171
Updates for cq 2.2 by @shimwell in https://github.com/fusion-energy/paramak/pull/176
Improved testing layout by @shimwell in https://github.com/fusion-energy/paramak/pull/178
pinned pyparsing ~= 2.4.7 by @shimwell in https://github.com/fusion-energy/paramak/pull/179
Adding export h5m using brep gmsh method by @shimwell in https://github.com/fusion-energy/paramak/pull/157
Adding export dagmc h5m method by @shimwell in https://github.com/fusion-energy/paramak/pull/180
Adding tests for cq master and automate conda build by @shimwell in https://github.com/fusion-energy/paramak/pull/182
Adding h5m export option, automated conda packaging by @shimwell in https://github.com/fusion-energy/paramak/pull/186
[skip ci] corrected conda channel order by @shimwell in https://github.com/fusion-energy/paramak/pull/187
targetting correct config file in anaconda dev build by @shimwell in https://github.com/fusion-energy/paramak/pull/188
removed packages that are brought in already by @shimwell in https://github.com/fusion-energy/paramak/pull/190
Updating conda Dev action by @shimwell in https://github.com/fusion-energy/paramak/pull/191
improving conda builds by @shimwell in https://github.com/fusion-energy/paramak/pull/192
Develop by @shimwell in https://github.com/fusion-energy/paramak/pull/193
New Contributors
@ZedThree made their first contribution in https://github.com/fusion-energy/paramak/pull/166
Full Changelog: https://github.com/fusion-energy/paramak/compare/v0.6.5...0.7.0
Type Of Technology Software 
Year Produced 2021 
Impact We improved the build and testing systems 
URL https://zenodo.org/record/6245845
 
Title nc-complex 
Description nc-complex is a lightweight, drop-in extension for netCDF that handles reading and writing complex numbers. Currently there are C and C++ APIs, and it has been integrated into netcdf4-python. A Fortran API is also planned. 
Type Of Technology Software 
Year Produced 2023 
Open Source License? Yes  
Impact nc-complex has been integrated into netcdf4-python, enabling out-of-the-box support for complex numbers 
URL https://github.com/PlasmaFAIR/nc-complex
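
netCDF has no built-in complex number type, so complex data has to be stored using some convention. One common workaround is to store the real and imaginary parts along a trailing dimension of length two. The pure-Python sketch below illustrates only that round trip; it does not use netCDF or the nc-complex API, and both function names are hypothetical.

```python
def complex_to_pairs(values):
    """Split complex numbers into [real, imaginary] pairs (trailing length-2 axis)."""
    return [[z.real, z.imag] for z in values]


def pairs_to_complex(pairs):
    """Rebuild complex numbers from [real, imaginary] pairs."""
    return [complex(re, im) for re, im in pairs]


if __name__ == "__main__":
    data = [1 + 2j, -0.5j, 3.0 + 0j]
    stored = complex_to_pairs(data)   # what would be written to the file
    restored = pairs_to_complex(stored)
    print(stored, restored == data)
```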
 
Title neasy-f 
Description neasy-f is a short-and-sweet wrapper for netCDF-Fortran. Rather than attempting to be a feature-complete replacement, neasy-f provides wrappers for some common operations, trying to keep simple things simple. 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact neasy-f is a new wrapper for the netCDF Fortran API. I have already used it successfully in two separate pieces of software to greatly simplify the existing uses of netCDF and so improve the maintainability of the codes. For example, using neasy-f in GS2 allowed me to remove a previous netCDF wrapper library, and consolidate a lot of code spread across two separate modules, resulting in the net removal of more than 15,000 lines of code. 
 
Title openmc-plasma-source 
Description This python-based package offers a collection of pre-built OpenMC neutron sources for fusion applications. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact We modernised the build system, enforced constraints on various physical parameters, and greatly improved the tests 
URL https://github.com/fusion-energy/openmc-plasma-source/releases/tag/v0.2.7
 
Title pyrokinetics 
Description This project aims to standardise gyrokinetic analysis by providing a single interface for reading and writing input and output files from different gyrokinetic codes, normalising to a common standard, and performing standard analysis methods. A general pyro object can be loaded either from simulation/experimental data or from an existing gyrokinetics file. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact We have modernised the build system, expanded the testing, added CI and automated packaging, and have begun significant refactoring in order to improve sustainability 
 
Title tokamesh v0.3.0 
Description Tokamesh is a Python package which provides tools for constructing meshes and geometry matrices used in tomographic inversion problems in toroidal fusion-energy devices such as MAST-U. 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact I developed a Cython interface for a bundled C library in this Python package, making the whole thing more portable 
URL https://github.com/C-bowman/tokamesh/releases/tag/0.3.0
 
Description Guest on podcast about testing Python code 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Guest on major RSE podcast, discussed how to test Python code
Year(s) Of Engagement Activity 2022
URL https://codeforthought.buzzsprout.com/1326658/11830579-bytesized-testing-your-python-code
 
Description Podcast interview of RSE Fellows cohort 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Interview with the recent RSE Fellows cohort on a major RSE podcast
Year(s) Of Engagement Activity 2022
URL https://codeforthought.buzzsprout.com/1326658/9859960-join-the-fellowship
 
Description Satellite conference at IOP Plasma Physics 2022, "Community Code and Data" 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Software is vital to modern research and nowhere more so than in plasma science. From first principles simulations of tokamak plasmas, to analysis of experimental atmospheric plasmas, our community makes heavy use of software. We also sometimes spend large amounts of computational resources -- and therefore carbon -- generating data that is used by a single group maybe once or twice. This satellite workshop explored various ways we can work together to make better use, and reuse, of our resources. What tools do we as a community need to improve our research? How can we enable secondary uses of the data we generate?

This workshop was hosted at the national IOP Plasma Physics conference in order to maximise impact, and was held on the free afternoon. More than 30 people attended the workshop, just under half the total attendees of the main conference. We had four speakers, two of whom were international and one of whom was from industry. There was good discussion with lots of audience participation around the issues raised. I also spoke about my Fellowship work, and this led to some fruitful conversations and further work.
Year(s) Of Engagement Activity 2022
 
Description Talk at ARCHER2 Celebration of Science 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The ARCHER2 community, including PIs, ECRs, and other users, gathered to hear about key findings from across the whole HPC community in the UK
Year(s) Of Engagement Activity 2024