Building Capability and Support in Research Software

Lead Research Organisation: University of Sheffield
Department Name: Computer Science

Abstract

"Software is the most prevalent of all the instruments used in modern science" [Goble 2014]. Scientific software is not just widely used [SSI 2014] but also widely developed. Yet much of it is developed by researchers who have little understanding of even the basics of modern software development with the knock-on effects to their productivity, and the reliability, readability and reproducibility of their software [Nature Biotechnology]. Many are long-tail researchers working in small groups - even Big Science operations like the SKA are operationally undertaken by individuals collectively.

Technological development in software is more like a cliff-face than a ladder - there are many routes to the top, to a solution. Further, the cliff face is dynamic - constantly and quickly changing as new technologies emerge and decline. Determining which technologies to deploy and how best to deploy them is in itself a specialist domain, with many features of traditional research.

Researchers need empowerment and training to give them confidence with the available equipment and the challenges they face. This role, akin to that of an Alpine guide, involves support, guidance, and load carrying. When optimally performed it results in a researcher who knows what challenges they can attack alone, and where they need appropriate support. Guides can help decide whether to exploit well-trodden paths or explore new possibilities as they navigate through this dynamic environment.

These guides are highly trained, technology-centric, research-aware individuals with a curiosity-driven nature who are dedicated to supporting researchers by forging a research software support career. Such Research Software Engineers (RSEs) guide researchers through the technological landscape and form a human interface between scientist and computer. A well-functioning RSE group will not just add to an organisation's effectiveness; it will have a multiplicative effect, since it will make every individual researcher more effective. It has the potential to improve the quality of research done across all University departments and faculties.

My work plan provides a bottom-up approach to providing RSE services that is distinct from, yet complements, the top-down approach provided by the EPSRC-funded Software Sustainability Institute.

The outcomes of this fellowship will be:

Local and National RSE Capability: An RSE Group at Sheffield as a credible roadmap for others, pump-priming a UK national research software capability; and a national Continuing Professional Development programme for RSEs.

Scalable software support methods: A scalable approach, based on "nudging", to providing research software support for scientific software efficiency, sustainability and reproducibility, with quality guidelines for research software and guidance for researchers on how best to incorporate research software engineering support within their grant proposals.

HPC for long-tail researchers: 'HPC-software ramps' and a pathway for standardised integration of HPC resources into Desktop Applications fit for modern scientific computing; a network of HPC-centric RSEs based around shared resources; and a portfolio of new research software courses developed with partners.

Communication and public understanding: A communication campaign to raise the profile of research software exploiting high profile social media and online resources, establishing an informal forum for research software debate.

References

[Goble 2014] Goble, C. "Better Software, Better Research". IEEE Internet Computing 18(5): 4-8 (2014)

[SSI 2014] Hettrick, S. "It's impossible to conduct research without software, say 7 out of 10 UK researchers" http://www.software.ac.uk/blog/2014-12-04-its-impossible-conduct-research-without-software-say-7-out-10-uk-researchers (2014)

[Nature 2015] Editorial "Rule rewrite aims to clean up scientific software". Nature 520(7547) (April 2015)

Planned Impact

The proposed programme of work will form a bridge between academic researchers and commercial and non-commercial providers of research software. It will deliver improvements in the availability, usability and awareness of research software and best practice in its use and development. The impacts will be felt both at the University of Sheffield and across the national research software landscape. Through collaboration with commercial and non-commercial partners and delivery of ideas both in Sheffield and across the UK, I expect to provide a framework for research software support that will be used as an international exemplar of best practice.

My work plan provides a bottom-up approach to providing RSE services that is distinct from, yet complements, the top-down approach provided by the EPSRC-funded Software Sustainability Institute.

Academic researchers and RSEs at The University of Sheffield will benefit from the formation of an RSE group whose remit is to provide scalable RSE support to the entire University. This group will directly improve academic software, provide a wide range of training opportunities, advise on best practice and directly contribute to research grants. The RSE profession will benefit from the creation of this high-profile node in the national RSE network.

Academic researchers and RSEs around the UK will benefit from the formation of an RSE brokerage. Seeded initially by effort from The Software Sustainability Institute, Manchester, UCL and Sheffield (see letters of support), this will form a national RSE capability. This capability will be enhanced through the development of a Continuing Professional Development (CPD) programme.

Commercial developers of scientific software will contribute to and benefit from the development of research software guidelines aimed at easing integration into commercial products. The RSE CPD programme will take contributions from commercial partners (see letters of support) which will assist them in developing closer links with academia and disseminating their latest products and features.

Academic researchers and RSEs around the world will benefit from the open course materials developed in collaboration with high-profile project partners including Microsoft Research, the N8 research partnership of northern universities and Software Sustainability Institute (see letters of support). Initial outputs will include 'HPC Carpentry' and 'Software and Data Carpentry for Windows'.

All users and developers of research software will benefit from the wide range of articles published on social media, blogs and websites. In particular, the 'long tail' of self-taught research software developers will have access to a large quantity of quality advice and tutorial materials developed as a result of RSE consultancy activities. These will be disseminated on my popular (500,00 annual visitors) website, www.walkingrandomly.com along with the websites of project partners.

The improved research made possible by the improved software landscape will benefit the wider public. Previous work includes research in areas such as land-mine detection, radio astronomy, medical image processing, computational finance, numerical algorithms, pure mathematics and functional genomics. It is expected that the support provided by this fellowship will increase the breadth and depth of such contributions.

Publications

Ihle M (2017) Striving for transparent and credible research: practical guidelines for behavioral ecologists. in Behavioral Ecology: official journal of the International Society for Behavioral Ecology

Richardson C (2018) Research Software Engineer: A New Career Track? in SIAM News

 
Description Best Practice for Code Archiving in Ecology
Geographic Reach Multiple continents/international 
Policy Influence Type Influenced training of practitioners or researchers
Impact At the 2016 British Ecological Society annual meeting in Liverpool, the Methods in Ecology and Evolution (MEE) team held a half-day workshop on 'Best Practice for Archiving Code'. The idea for the workshop came from a meeting several members of the journal editorial board held earlier in 2016 to discuss the complex issues around publishing code. MEE, like all the BES journals, supports the principles of open science and wants to make sure that code published in the journal is readily available to readers and adheres to key principles of quality, usability, accessibility and functionality. Coding is becoming an increasingly important skill for ecologists, but training is often not readily available and ecologists tend to be self-taught. MEE wants to introduce guidance for authors so that published code is as useful as possible to its readership. However, the journal does not want potential authors to feel that they lack the coding prowess to adhere to the guidelines and so be put off from publishing there. The team therefore designed a workshop with two aims:

* To give attendees training in using code in their research and getting it ready to publish.
* To consult participants on the usefulness of the proposed code guidelines for authors.

The training section of the workshop took the form of three practicals led by experts. On the day, the sessions were guided exercises, but each has been designed as a self-learning module, so people who were not there but are interested in learning more can access all the practicals at https://github.com/BES2016Workshop

Laura Graham, an ecologist with a background as a data analyst, taught delegates how to write reproducible code in R, giving them a reproducible workflow that can easily be revisited at a later date. Tamora James, an ecologist with a programming background, led a session on version control using GitHub. Version control is the process of tracking changes to documents and code over time, giving delegates a flexible way to manage and share their code. Finally, Mike Croucher, a software engineer with experience providing training to ecologists, led a session on code publication and citation. He introduced the group to the Software Citation Principles as developed by a FORCE11 working group and to Zenodo, a service developed by CERN that allows publishing research objects such as scripts, data and software packages in a way that satisfies the software citation principles.

The second section of the workshop was a consultation with delegates on MEE's draft code guidelines. Led by MEE Executive Editor Rob Freckleton, this raised some interesting questions from the audience, including:

* How should we deal with new releases of software after publication of papers?
* To what extent should we expect reviewers to peer review code?
* Where should we draw the line between good practice, which is practical to adhere to for the community, and best practice, a gold standard which might be practically unattainable?
URL http://www.britishecologicalsociety.org/workshop-best-practice-code-archiving/
 
Description Out of Our Minds - Research Leadership Award
Amount £935,900 (GBP)
Organisation The Leverhulme Trust 
Sector Charity/Non Profit
Country United Kingdom
Start 02/2017 
End 02/2022
 
Title NOW corpus compilation and data extraction 
Description NOW corpus compilation and data extraction for Petar Millin (Dept. of Journalism studies). Built and populated a large (150 GB) sqlite3 database from supplied zip files of the News on the Web corpus. Also extracted specific data from the database. Scripts to reproduce the database build on a cluster are openly available. 
Type Of Material Database/Collection of data 
Year Produced 2017 
Provided To Others? No  
Impact Database to be ported to a live database format and made accessible to faculty members 
URL http://annakrystalli.me/news-scrape/index.html
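
As an illustration of the kind of build step described in this record, the following is a minimal sketch of loading the text members of a set of zip archives into a single SQLite database using only the Python standard library. It is not the project's actual script: the function name `load_zips_into_sqlite`, the `documents` table and its columns are all hypothetical, and the real corpus files may need format-specific parsing that is not shown here.

```python
import sqlite3
import zipfile
from pathlib import Path


def load_zips_into_sqlite(zip_dir, db_path):
    """Insert every file found in every zip under zip_dir into one SQLite table.

    The table name and schema are illustrative assumptions, not the schema
    used by the project described above.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "archive TEXT, member TEXT, body TEXT, "
        "PRIMARY KEY (archive, member))"
    )
    for zip_path in sorted(Path(zip_dir).glob("*.zip")):
        with zipfile.ZipFile(zip_path) as archive:
            # Lazily yield one row per file so a large archive is never
            # held in memory all at once.
            rows = (
                (
                    zip_path.name,
                    info.filename,
                    archive.read(info).decode("utf-8", errors="replace"),
                )
                for info in archive.infolist()
                if not info.is_dir()
            )
            # One transaction per archive; INSERT OR REPLACE makes reruns
            # overwrite earlier rows instead of failing on the primary key.
            with conn:
                conn.executemany(
                    "INSERT OR REPLACE INTO documents VALUES (?, ?, ?)", rows
                )
    return conn
```

Committing once per archive rather than once per row is the main design choice here: it keeps bulk inserts fast and means an interrupted build can simply be rerun.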
 
Description Research Software Engineering Sheffield 
Organisation University of Sheffield
Country United Kingdom 
Sector Academic/University 
PI Contribution Research Software Engineering Sheffield is a group dedicated to the improvement of research through the improvement of the software that underlies it. It was created as a direct result of this fellowship award. By offering training, consultancy and facilities, we work with researchers to make their software more robust, easier to use and more efficient. We also work with the local, national and international RSE communities to raise the profile and quality of research software.
Collaborator Contribution The University of Sheffield has embraced our values and has engaged with us at every level. We have the support of researchers, senior management and professional services.
Impact
- Collaborated with lecturers and University teachers to improve the quality of programming, high performance computing and cloud computing courses across multiple disciplines.
- Provided a suite of training courses to researchers at all levels in areas such as programming, software engineering and high performance computing.
- Accelerated computer programs from a wide range of disciplines to enable researchers to get their results up to 3 orders of magnitude more quickly.
- Engaged with high performance computing and cloud computing providers to assist researchers in making use of such facilities.
- Working with Sheffield Human Resources to produce a career pathway for Research Software Engineers.
Start Year 2016
 
Description RodentDataAnalytics 
Organisation University of Sheffield
Country United Kingdom 
Sector Academic/University 
PI Contribution Development of the RodentDataAnalytics software, mentorship of the lead developer, project management and consultancy. We helped make the code faster, more robust and easier to use and set out a programme of work for the current lead developer. We also financed a trip for the lead developer to visit users of the software in Poland.
Collaborator Contribution The science and methodology behind the software were developed by the academic partners. They also provided the day-to-day programming effort.
Impact Software at https://github.com/RodentDataAnalytics/mwm-ml-gen
Start Year 2016
 
Description Software and Data Carpentry 
Organisation Software Carpentry Foundation
Country United States 
Sector Charity/Non Profit 
PI Contribution Under the auspices of 'Research Software Engineering Sheffield', we have entered into a partnership with the Software Carpentry Foundation (SCF). https://software-carpentry.org/scf/ The SCF is an organisation dedicated to providing training to researchers on the basics of good practice in software development and data management. We have joined as a 'Silver' member which allows us to train 6 instructors a year and hold an unlimited number of Software Carpentry Workshops per year. We also have a seat on the SCF Advisory Board which allows us to advise on future developments. We will be co-ordinating Software Carpentry activities at The University of Sheffield in collaboration with the University library and IT department.
Collaborator Contribution The Software Carpentry Foundation provides the lessons and infrastructure to allow Software Carpentry events to take place.
Impact Training of a group of instructors at University of Sheffield.
Start Year 2016
 
Title Applying Research Software Engineering methodologies to documenting and supporting a University HPC system 
Description Using modern and open RSE technologies such as Sphinx, GitHub and ReadTheDocs we have developed a collaborative documentation system for Sheffield's local High Performance Computing System. The new site allows multiple stakeholders to collaborate on the development of the system including local HPC support, the Research Software Engineering team and the research community. The site includes all of our install scripts, documentation and tutorials and is released under a permissive license. As such, other HPC centers can build on our work. 
Type Of Technology New/Improved Technique/Technology 
Year Produced 2016 
Impact The documentation system for Cirrus, the HPC system at the University of Edinburgh, uses similar technologies, and they cite us as the inspiration for their work at http://cirrus.readthedocs.io/en/latest/. 
URL https://github.com/rcgsheffield/sheffield_hpc
 
Title Custom HPC environments for training purposes 
Description A set of scripts to customise an Alces Flight instance so that the resulting virtual HPC cluster can be used to support a HPC training session. 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact Alces Flight, the organisation that develops the software that these scripts customise, wrote about this development at https://medium.com/@alcesflight/desperately-seeking-supercomputer-bdf72a10f1a8 
URL https://medium.com/@alcesflight/desperately-seeking-supercomputer-bdf72a10f1a8
 
Title RodentDataAnalytics/mwm-ml-gen 
Description mwm-ml-gen (Morris Water Maze - Machine Learning - Generalized) is a complete set of tools for analysis and classification of rodent trajectories inside the Morris Water Maze. 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact This software was the basis for the paper 'A generalised framework for detailed classification of swimming paths inside the Morris Water Maze', currently available on arXiv at https://arxiv.org/abs/1711.07446 
URL https://github.com/RodentDataAnalytics/mwm-ml-gen
 
Title Statistical Power Analyser web app 
Description Statistical Power Analyser web app for Tom Stafford (Dept. of Psychology); working demo complete. In development as part of the dataviz.shef showcase outreach. Waiting for results from Tom's simulations to finalise. 
Type Of Technology Webtool/Application 
Year Produced 2018 
Impact Will form interactive companion output to paper on simulation. 
URL https://annakrystalli.shinyapps.io/xspl_power_analyser
 
Title pkgreviewr 
Description I developed an R package to help with the rOpenSci package review process. rOpenSci is a non-profit initiative that aims to make scientific data retrieval reproducible. The package has now migrated to ropenscilabs on GitHub with a view to being integrated into their review process. The journal Methods in Ecology and Evolution has also expressed interest in integrating it into its review process. 
Type Of Technology Software 
Year Produced 2018 
Open Source License? Yes  
Impact The package has now migrated to ropenscilabs on GitHub with a view to being integrated into their review process. The journal Methods in Ecology and Evolution has also expressed interest in integrating it into its review process. 
URL https://github.com/ropenscilabs/pkgreviewr
 
Description Organised first Sheffield RSE Software Carpentry event 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Organised and was a helper for the first Sheffield RSE Software Carpentry event, covering UNIX, git, Python and SQL for ~30 Faculty of Engineering postgraduate students.
Year(s) Of Engagement Activity 2017
URL http://rse.shef.ac.uk/2017-08-16-sheffield/
 
Description 2nd CoDiMa Training School in Computational Discrete Mathematics (International Centre for Mathematical Sciences) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact PhD students in algebra & number theory from 12 UK Universities learned about good software engineering practices via a Software Carpentry boot camp that was centered around the use of the GAP computer algebra system and Jupyter notebooks. I gave a talk about good software practices called 'Is your Research Software Correct?'
Year(s) Of Engagement Activity 2016
URL https://storify.com/CIRCA_StAndrews/codima-2016
 
Description A Bite Size Guide to Research in the 21st Century 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact This is a series of events organised by Sheffield's Social Sciences Doctoral Training Centre (DTC). It was targeted at members of Sheffield's School of Health and Related Research (ScHARR). I took part by delivering a talk on good practices in research software development which led to audience members requesting information on further training opportunities to allow them to apply the concepts discussed to their future work.
Year(s) Of Engagement Activity 2016
URL https://www.sheffield.ac.uk/social-sciences-dtc/news-story/research-in-the-21st-century-1.577080
 
Description Amazon AWS Immersion day 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact We arranged a workshop with HPC-centric cloud computing experts from Amazon and partners on how to make effective use of AWS facilities including the Alces Flight HPC application.
Year(s) Of Engagement Activity 2017
 
Description Archer Champions 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The aims of the ARCHER Champions network are to:

- Provide a support network between staff members whose role involves advising users on access to local, regional and national HPC resources.
- Help to promote a coherent access structure to HPC resources across the UK, with coordination between tiers.
- Support and promote activities designed to provide career development to research software engineers seeking a career in HPC.
- Support and promote activities to broaden the UK HPC user base to new disciplines and communities.
- Participate in ARCHER Champions Workshops communities.
- Promote common training material and techniques.
Year(s) Of Engagement Activity 2016,2017
URL http://www.archer.ac.uk/community/champions/
 
Description BBSRC: Workshop to discuss Non-Faculty Researchers Careers and Recognition 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Towards the end of 2015 BBSRC circulated a survey requesting views on careers and skills from staff that they referred to as 'non-faculty researchers'. This group included technical staff, researchers operating instruments, facility managers, statisticians, bioinformaticians, technology developers and many others. They used the term 'non-faculty researchers' as there didn't appear to them to be a single job title or defining term to use, and the survey confirmed this, with a plethora of job titles returned.

They received over 800 replies, showing the size and diversity of this group and the concerns and questions they had about their careers and roles. Over the last year they looked closely at the survey results, discussed these with colleagues within the Research Councils and beyond and considered ways to address some of these issues (see their recent letter in Nature: rdcu.be/nDjB (PDF)).

This invitation-only event brought together practitioners, funders and policy-makers to discuss the issues involved and find a way forward.

I was invited to give a talk to this audience about the Research Software Engineering community as a successful case study in how a subset of non-faculty researchers improved their situation by developing a coherent identity within UK academia.
Year(s) Of Engagement Activity 2017
URL http://www.bbsrc.ac.uk/news/events/2017/1703-workshop-non-faculty-researcher-careers/
 
Description Building Communities - RSE working group 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Sparked by my attendance at the 2nd Research Software Engineering conference, I was invited by the European Molecular Biology Lab Bio-IT community to a working group of European RSEs to discuss strategies for effective computational research community and capacity building. Toby Hodges, Malvika Sharan and Georg Zeller shared experiences from EMBL Bio-IT, with a brief appearance (via video call) from Aidan Budd to provide details of the early days; Michael Meinel represented the DLR; Stephan Janosch brought experiences from a number of initiatives, spanning from his local institution, the Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG), via the Dresden Concept, a research alliance among 24 Dresden-based research institutions, to the national de-RSE network of German RSEs; Stefan Helfrich, an excellent impromptu addition resulting from Toby's involvement in the Carpentries' mentorship groups program, represented the Network of European Bioimage Analysts (NEUBIAS) as well as German BioImaging (GerBI). I represented Sheffield RSE and was sponsored to attend by the Software Sustainability Institute. Outcomes include a soon-to-be-published post on the SSI blog and the establishment of an RSE community call, initially focusing on aspects of community building.
Year(s) Of Engagement Activity 2018
URL https://github.com/tobyhodges/building-communities/blob/master/blogpost-AK.md
 
Description Code Cafe - Informal programming and data science lessons for researchers 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact 'Code Cafe' is a new initiative that's part of the suite of activities organised by 'Research Software Engineering Sheffield'. Currently in its pilot phase, it is based around a set of open, online training materials that can be worked through at a learner's own pace. We hold events in cafes at Sheffield where researchers can come and work through such material with the help of a group of expert facilitators.

Early events have covered the open source data analysis programming language R and have included an event with Wolfram Research on their Mathematica platform.
Year(s) Of Engagement Activity 2016,2017
URL http://www.walkingrandomly.com/?p=5981
 
Description Dagstuhl Perspectives Workshop 16252 - Engineering Academic Software 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This Dagstuhl Perspectives Workshop brought together activists, experts and stakeholders on the subject of high quality software produced in an academic context. Our current dependence on software across the sciences is already significant, yet there are still more opportunities to be explored and risks to be overcome. The academic context is unique in terms of its personnel, its goals of exploring the unknown and its demands on quality assurance and reproducibility.

We refer to the IEEE Internet Computing article "Better Software, Better Research" which motivated the topic. In this workshop we took the perspective of a research team which is in either or both of the following situations:

- consuming or producing software as an output of the academic process;
- consuming or producing software as a component of the research methods.

Society is now in the tricky situation where several deeply established academic fields (e.g. physics, biology, mathematics) are shifting towards dependence on software, programming technology and software engineering methodology which are backed only by young and rapidly evolving fields of research (computer science and software engineering). Full accountability and even validity of software-based research results are now duly being challenged.

With the outputs of this interactive and productive perspectives workshop, we strive to contribute in a positive manner to the above challenges. We formulated taxonomies with definitions to clarify the domain, we co-authored concrete policy and process documents to improve the status and recognition of academic software development and academic software engineers, and finally we formulated a list of 18 concrete declarations of intent ("I will" pledges). This list was presented to the WSSSPE community in September 2016 to acquire feedback and it will be the backbone of the Dagstuhl Manifesto document we are editing. It serves to motivate change by proposing policy changes with concrete actions and instilling positive attitudes towards academic software.
Year(s) Of Engagement Activity 2016
URL http://www.dagstuhl.de/de/programm/kalender/semhp/?semnr=16252
 
Description Dataviz.shef initiative 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Dataviz.shef is an initiative to promote and build community around data visualisation at the University of Sheffield. Visualisation has always been at the core of extracting understanding from data, but powerful, modern, open source, interactive and web-based visualisation tools have revolutionised the potential for research data impact. To help our researchers make the most of their data and take advantage of such tools, we (Research Software Engineering, the University Library and CICS) have been working on dataviz.shef, a multi-pronged initiative to provide tools and training and to build a community around interactive data visualisation at TUoS. We're also collaborating with the Interactive Data Network at the University of Oxford as OxShef dataviz and will be co-organising events and co-developing open CC-BY resources.
Year(s) Of Engagement Activity 2017,2018
URL http://dataviz.shef.ac.uk/
 
Description EPSRC RSE Fellows Launch event 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact This was the launch event for the EPSRC Research Software Engineering Fellowship, the first fellowship of its kind. Over 201 'expression of interest applications' were received and 7 fellowships were awarded nationally. This event brought together funders, policy makers, members of the scientific software industry, leading academics as well as leading RSE practitioners to discuss the new scheme and how to increase its impact in the future.

I additionally interviewed all 7 of the new fellows at http://www.walkingrandomly.com/?p=6037
Year(s) Of Engagement Activity 2016
URL https://storify.com/SoftwareSaved/rse-fellows-launch-at-the-royal-society
 
Description EPSRC Tier-2 launch event 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Six high performance computing centers were formally launched in the U.K. at this event. This expansion of HPC resources and access to them is being funded with £20 million from the Engineering and Physical Sciences Research Council. The EPSRC plays a role in the U.K. somewhat similar to that of the National Science Foundation in the U.S.

The centers are located at the universities of Cambridge, Edinburgh, Exeter, and Oxford, Loughborough University, and University College London. I was invited to give one of the keynote talks at this event and my talk was titled 'High Performance Computing - There's plenty of room at the bottom'
Year(s) Of Engagement Activity 2017
URL http://www.walkingrandomly.com/?p=6305
 
Description EPSRC software strategy workshop: Where will research software be in 10 years' time? 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact This was an event organised by EPSRC (Their description is below). I was invited to give a keynote talk: The Rich/Poor Divide in Research Software

Computer-supported modelling and simulation is now widely recognised as the third 'leg' of scientific method, alongside theory and experimentation. There are many phenomena which can be studied using only computational processes, which require the use of research software, for example the analysis of experimental data. Software is fundamental to research and according to a report by the Software Sustainability Institute 7 out of 10 researchers report their work would be impossible without it. A large amount of intellectual property, knowledge and understanding resides in software, and this is why software has such longevity: people replace their hardware, but don't dispose of their codes.

EPSRC recognised the importance of software and the need for it to be regarded as an infrastructure in its own right. EPSRC published the Software as an Infrastructure strategy in 2012 to ensure that our investments added value to the software landscape and supported the Engineering and Physical Sciences Community. Furthermore, we also published an e-infrastructure roadmap, which highlighted the importance of investing in software.

1.2 The need for a new strategy
The EPSRC Software as an Infrastructure Strategy and associated action plan was published in 2012 following a community workshop and consultation. Four years on, inevitably, the desired outcomes for the software landscape have evolved. EPSRC have also published our new Delivery Plan, which outlines how we will deliver our strategy for the period of 2016-2020, and we have a renewed vision for software.

Our vision is for Software as an Infrastructure to continue to receive support and deliver a robust, long-term programme and vision to support the research, development, management and maintenance of high-quality, robust, recognised and accredited software for the scientific communities.
With this new vision and the changes to the software landscape since our current strategy was published it is essential that we update and implement a new strategy for 2016-2020. With this in mind, we are holding the software strategy workshop to gain input from the community about how to refresh our strategy.
1.3 Aims of the software workshop

The aims of this Software workshop are to:

• Gain community feedback on key priorities in software to address in the future.
• Gain input from the community on how to ensure a long term strategy for software.
• Get advice on new objectives that require action and investment in the short, medium and long term.

The aim of this briefing is to:

• Provide some background information on the Software as an Infrastructure strategy.
• Present a summary of EPSRC's investments since 2012.
• Present EPSRC's proposed framework for a new software strategy.
• Provide further information about the workshop sessions.
• Give some background on the software landscape in the UK.
Year(s) Of Engagement Activity 2016
 
Description Future Open Science services for scientific communities panel 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I was invited to participate in a panel discussion at the EOSCpilot/OpenAIRE-DE joint workshop "Future Open Science services for scientific communities". This free workshop addressed researchers from diverse scientific communities, as well as research administrators and libraries, to present the EOSC vision, its first phase of implementation via the EOSCpilot project, and OpenAIRE's services for Open Science. It was a forum for dialogue, serving to collect feedback from participants about their expectations for the EOSC and their views on the challenges and opportunities its implementation will bring. This feedback will help both EOSCpilot and OpenAIRE to shape their future developments according to researchers' needs. The panel discussion focused on the needs of specific research communities with respect to the European Open Science Cloud (EOSC) initiative. Drawing on my experience as an RSE, I focused on the practicalities of the EOSC vision with respect to current research computational workflows. Implementing modern scientific workflows that are compatible with the EOSC vision involves a non-trivial capacity-building requirement, and researchers will need help in transitioning.
Year(s) Of Engagement Activity 2017
URL https://eoscpilot.eu/events/eoscpilotopenaire-de-joint-workshop-future-open-science-services-scienti...
 
Description HPC and Cloud computing seminar with Red Oak Consulting 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact I was invited to give a talk about Research Software Engineers and what they need from High Performance Computing resources. The audience included HPC service providers, cloud vendors, researchers and funders. I also took part in the subsequent discussion panel.
Year(s) Of Engagement Activity 2017
URL https://www.redoakconsulting.co.uk/hpc-and-cloud-seminar/
 
Description International Research Software Engineering Conference 2016 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This was the first international conference dedicated to Research Software Engineering. Hosting around 200 delegates from around the world, the conference was a mixture of talks, workshops and networking opportunities with the opening plenary given by Matthew Johnson of Microsoft Research. I was workshop co-chair with workshops coming from both academia and industry giving delegates many learning opportunities in a range of cutting-edge software technologies.
Year(s) Of Engagement Activity 2016
URL http://www.walkingrandomly.com/?p=6236
 
Description Introduction to Modern Fortran (Ian Bush) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Ian Bush is one of EPSRC's RSE fellows and is a national expert on the Fortran programming language. Despite its age, Fortran is still THE programming language of choice for HPC applications and so Fortran skills are in demand from the research community. This event was a collaboration between Research Software Engineering Sheffield and Ian Bush and aimed to ensure that people new to Fortran learned modern good practice from the outset.
Year(s) Of Engagement Activity 2017
 
Description Introduction to parallel computing with MPI 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact This two-day workshop was arranged in collaboration with the Numerical Algorithms Group (NAG) and taught postgraduate students and researchers how to use MPI (the Message Passing Interface) for parallel computing.
Year(s) Of Engagement Activity 2017
URL https://github.com/mikecroucher/CloudCluster
 
Description Invited to attend SciFoo and ran session on capacity building for open reproducible research 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact SciFoo is invite-only and brings together researchers, technologists, writers, educators, artists, policy makers, investors and other thought leaders for a weekend of unbridled discussion, demonstration and debate.

It is an informal conference format pioneered by O'Reilly Media, a leading book publisher and event organiser in the field of information technology. There is no predefined agenda; instead, attendees collaboratively create one, with few if any boundaries as to what can be discussed. During the event, I ran a session on 'Open Data, Open Software, Reproducible research: everyone wants it, who's going to train us?'. The session was well attended considering the split across sessions (~25 attendees) and generated some interesting discussion around how to build computational literacy capacity throughout academia. I also gave Tim O'Reilly a copy of the UK RSE State of the Nation report.
Year(s) Of Engagement Activity 2017
URL https://www.digital-science.com/events/scifoo-camp-2017/
 
Description Keynote speech at 2nd International Research Software Engineering conference 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I was invited to give one of the keynote talks at this international event which attracted over 200 people from more than a dozen countries.
Year(s) Of Engagement Activity 2017
URL https://mikecroucher.github.io/RSE_2017_keynote_presentation/
 
Description Mozilla Open Leaders Mentorship Program 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Undergraduate students
Results and Impact I mentored two projects in rounds 4 and 5 of the Mozilla Open Leaders Mentorship program. Round 4 (Fall 2017): Data Literacy Playground, an Open Educational Resource led by Samantha Ahern and targeted at 16-24 year olds to help them navigate their privacy rights and the implications of sharing their data (https://medium.com/read-write-participate/data-literacy-playground-cabcf7d9dab). Round 5 (Spring 2018): GirlsGetGeeky, for which I mentored a team of web development educators (Cecillia Mbugua and Stella Maris Njage) to develop offline Ruby on Rails training material and run a training series with girls and women aged 16 and over in Kenya.
Year(s) Of Engagement Activity 2017,2018
URL https://mozilla.github.io/leadership-training/round-5/projects/#girlsgetgeeky
 
Description N8 Microsoft Azure training day at University of Sheffield 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Postgraduate students
Results and Impact I arranged this workshop with Microsoft under the auspices of Research Software Engineering Sheffield. Delegates were invited from universities comprising the N8 consortium, a collaboration of the eight most research intensive Universities in the North of England: Durham, Lancaster, Leeds, Liverpool, Manchester, Newcastle, Sheffield and York.

Microsoft taught a range of short lessons on various aspects of cloud technologies. The event led to at least one research group applying for and winning a $20K grant of Azure time for use in their research. This represents a significant computational resource for their work.
Year(s) Of Engagement Activity 2016
 
Description Open Science Workshop at International Society for Behavioral Ecology 2016 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A seminar called 'Is Your Research Software Correct?' was delivered, discussing the problems inherent in research software practices and what can be done about them. This was followed by a workshop for participants on how to use version control within R and RStudio, commonly used data analysis tools in Ecology.
Year(s) Of Engagement Activity 2016
URL https://malikaihle.wordpress.com/openscienceworkshop/
 
Description OpenCon London Reprohack 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact OpenCon is a conference series and community bringing together leading students and early career academic professionals from across the world to learn about Open Access, Open Education, and Open Data, develop critical skills, and catalyze action toward a more open knowledge sharing system. For the 2017 OpenCon London satellite event, I co-organised and ran the day-long doathon. In particular, I ran a reproducibility hackathon, where participants attempted to reproduce published papers from published code and data. We then fed back to the original authors on aspects of reproducibility, transparency, documentation and reuse.
Year(s) Of Engagement Activity 2017
URL http://rse.shef.ac.uk/blog/opencon_london/
 
Description Parallel Programming with OpenMP workshop (Collaboration with NAG 2016) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact OpenMP is one of the most popular technologies used to accelerate software through the use of parallelisation. This workshop was a collaboration between Research Software Engineering Sheffield and the Numerical Algorithms Group (NAG) and targeted delegates from University of Sheffield although some places were available to academics from nearby institutions. Some of the audience members reported that they had accelerated their own code within a couple of weeks of completing the course.
Year(s) Of Engagement Activity 2016
 
Description R Data Carpentry workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Taught R for a Data Carpentry workshop organised by the University of Sheffield Library for Life Sciences for circa 30 postgraduate students
Year(s) Of Engagement Activity 2017
URL https://alfawolf140.github.io/2017-07-26-Sheffield/
 
Description R awareness session for IT support staff 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact The audience was a combination of desktop support, applications support and research software engineers from The University of Manchester

The focus of the session was not the R language itself but the software infrastructure that surrounds it: multiple versions of R, packages, RStudio, Jupyter notebook, Microsoft R Open, SageMathCloud and the way that various applications such as Mathematica, Maple and Visual Studio interact with R.

I chose to deliver the material in the same way that The Code Cafe is delivered: self-directed material where I act as facilitator. This seemed to work really well, and there was a lot of conversation and interaction with the audience of the kind that I find is missing from a more traditional presentation.

Course material is at https://github.com/mikecroucher/R_awareness
Year(s) Of Engagement Activity 2016
URL http://www.walkingrandomly.com/?p=6115
 
Description RCUK Cloud Computing Workshop 2018 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The RCUK Cloud Working Group hosted this workshop to bring together researchers and technical specialists to share their experiences in the application of cloud computing technology for the research community. The meeting included presentations from a range of research domains including particle physics, the environmental sciences, medical research and bioinformatics.
Year(s) Of Engagement Activity 2018
URL https://cloud.ac.uk/
 
Description RSE Leaders meetings 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The RSE-Leaders network is made up of leaders of Research Software Engineering groups in universities and research institutes around the UK. The aim of the network is to co-ordinate RSE activities nationally as well as to disseminate good practice in this emerging community.
Year(s) Of Engagement Activity 2015,2016,2017
 
Description Research software management, sharing and sustainability workshop (University of Birmingham) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Jisc, in collaboration with the University of Birmingham, invited all researchers interested in and passionate about developing or using research software to join a free workshop on this subject.

The aims were to:

- Bring together a range of experts who can answer questions and guide delegates on their most critical issues
- Provide delegates with a list of available resources on the subject, tailored to the problems that they encounter when managing research code
- Listen to and collate the most common problems that delegates are having in this area
Year(s) Of Engagement Activity 2017
URL http://www.birmingham.ac.uk/events/events/Research-software-management-sharing-and-sustainability-wo...
 
Description Research software management, sharing and sustainability workshop (University of Sheffield) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Jisc, in collaboration with The University of Sheffield, organised a free workshop for researchers who are interested in developing or using research data software.

The aim was to collate common problems and share experiences around managing and sharing software, as well as to develop new solutions.

Experts were on hand to answer any questions and inform participants of available resources.
Year(s) Of Engagement Activity 2017
URL https://www.sheffield.ac.uk/library/libnews/researchworkshop
 
Description SSI Collaborations Workshop 2017 (CW17) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The Software Sustainability Institute's Collaborations Workshops series brought together researchers, developers, innovators, managers, funders, publishers, leaders and educators to explore best practices and the future of research software. Collaboration Workshop 17 (CW17) took place from 27th to 29th March 2017 at the Leeds University Business School, University of Leeds.

The theme of the workshop was The Internet of Things (IoT) and Open Data: implications for research.
Year(s) Of Engagement Activity 2017
URL https://www.software.ac.uk/cw17
 
Description Sheffield R Users Group 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Member of the organising committee of the Sheffield R Users Group.
SheffieldR provides a space for anyone with an interest in R to meet up, hear from their peers about R packages and implementations and network. The events are free and anyone, at whatever level of R skill, is welcome to attend.
Year(s) Of Engagement Activity 2017,2018
URL https://www.meetup.com/SheffieldR-Sheffield-R-Users-Group/events/
 
Description Sheffield R Users Group - Hacktoberfest Hacky Hour series 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Ran weekly Hacktoberfest hacky hour sessions throughout October for the R Users group. Hacktoberfest is a month-long celebration of open source software run through a partnership between GitHub and Digital Ocean with t-shirt prizes for participants who complete the challenge. The sessions introduced participants to contributing to open source and gave them the opportunity to practice using open source collaborative tools.
Year(s) Of Engagement Activity 2017
URL http://rse.shef.ac.uk/blog/sheffieldR-hacktoberfest/
 
Description Sheffield R Users Group Talk - Literate Programming 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Gave a talk and demonstration of project organisation and documentation through the workflowr package in R at the November 2017 Sheffield R Users group meetup for 16 attendees (https://www.meetup.com/SheffieldR-Sheffield-R-Users-Group/events/244746978/)
Year(s) Of Engagement Activity 2017
URL https://github.com/annakrystalli/workflowr-demo
 
Description Sheffield R Users Group Talk - Literate Programming 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Gave a talk and demonstration of different approaches to literate programming in R at the May 2017 Sheffield R Users group meetup for 18 attendees (https://www.meetup.com/SheffieldR-Sheffield-R-Users-Group/events/240261060/)
Year(s) Of Engagement Activity 2017
URL https://github.com/annakrystalli/lit-prog
 
Description Sheffield Research Software Engineering Blog 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The Sheffield RSE blog is intended to showcase Research Software Engineering activities at University of Sheffield. It serves as a hub for all RSE activities within the University.
Year(s) Of Engagement Activity 2017,2018
URL http://rse.shef.ac.uk/blog/
 
Description Software Sustainability Collaborations Workshop 2016 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact This event was arranged by the Software Sustainability Institute and was focused on credit in academic software.
Year(s) Of Engagement Activity 2016
URL https://www.software.ac.uk/blog/2016-04-28-collaborations-workshop-2016-cw16-report
 
Description Talk at l'Université Paris-Sud - PRATIQUES COLLABORATIVES & COMPÉTENCES NUMÉRIQUES 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact An event to discuss collaborative research computing practices in France including Research Software Engineering and data science.
Year(s) Of Engagement Activity 2017
URL http://proto204.co/portfolio/codefoc/
 
Description University of Lancaster Visit (Is Your Research Software Correct) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact A seminar called 'Is Your Research Software Correct?' that discusses the problems inherent in research software practices and what can be done about them.
Year(s) Of Engagement Activity 2016
URL https://github.com/mikecroucher/MLPM_talk
 
Description University of Nottingham visit (Is Your Research Software Correct) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact A seminar called 'Is Your Research Software Correct?' that discusses the problems inherent in research software practices and what can be done about them. It was delivered to the School of Mathematical Sciences.
Year(s) Of Engagement Activity 2016
URL https://github.com/mikecroucher/MLPM_talk
 
Description Visit to Sparx - Is your research software correct? 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Sparx is a scientifically driven, research-based educational technology company that uses technology, data and real-world classroom observation to scientifically investigate the way young people learn. https://www.sparx.co.uk/

They employ a data science team who have a workflow that is very similar to that of academic researchers. As such, they face a very similar set of challenges. I visited the company in February 2017 to give a talk 'Is your research software correct?' and discussed best practice with them.
Year(s) Of Engagement Activity 2017
 
Description WalkingRandomly blog 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact WalkingRandomly is a blog focused on research software, high performance computing and mathematics. It is widely read and cited by community members in these subject areas and receives between 250,000 and 500,000 visitors a year.
Year(s) Of Engagement Activity Pre-2006,2007,2009,2011,2013,2015,2017
URL http://www.walkingrandomly.com/
 
Description Warwick University: Bad software and how it is ruining your research 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact I gave a talk to researchers at The University of Warwick about good software practices in Research.
Year(s) Of Engagement Activity 2017
URL https://warwick.ac.uk/fac/sci/wcpm/seminars/
 
Description Wolfram Research Training Day 2016 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact A day of training and workshops for members of the University of Sheffield on how to make good use of Mathematica and The Wolfram Language, a popular commercial computer algebra system and technical computing environment. The event was a collaboration between Research Software Engineering Sheffield and Wolfram Research and has led to a call to host a larger event in collaboration with University of Manchester in 2017.
Year(s) Of Engagement Activity 2016
 
Description Workshop on Nordic Big Biomedical Data for Action in Stockholm 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact NIASC and NeIC organised a three-day Nordic Workshop on Nordic Big Biomedical Data for Action which brought together practitioners, software developers and compute infrastructure providers from across the Nordic region. The aim of the event was to showcase what the community was doing in the areas of HPC and big data, and how this work could assist biomedical research.
Year(s) Of Engagement Activity 2016
URL http://www.nordicehealth.se/2016/12/04/workshop-on-nordic-big-biomedical-data-for-action/
 
Description ___ and R: How R relates to other technologies 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact This was a presentation given to the Sheffield R Users Group, a grassroots club made up of enthusiasts, researchers and industrial users of R. My talk focused on how R interfaces with a range of other technologies such as High Performance and Cloud computing, commercial software packages and open source libraries. It sparked discussion and led to the discovery of academics who need help with their R code.
Year(s) Of Engagement Activity 2016
URL https://mikecroucher.github.io/x_and_R/#/1