Sustainability and EDI (Equality, Diversity, and Inclusion) in the R Project

Lead Research Organisation: University of Warwick
Department Name: Statistics

Abstract

Many large scientific software projects depend heavily on the research community for their maintenance and development. In the case of R, the free software environment for data-analytic computing and graphics, the core developers have mainly been people in traditional academic roles, such as statistics professors. As such, their focus has been on aspects of development related to their areas of research, with other necessary functions being done as service, which may or may not be recognised by their institutions.

This paradigm has led to a crisis of sustainability for the R project, since there has been insufficient investment in establishing open, sustainable development practices or mentoring new contributors. Many of the current core developers are past or nearing retirement, leaving the R project in a precarious situation. Since R is widely used in the development of research software, this is an issue that the research community must urgently address.

The current model of core development has also led to a diversity problem. Contributing in a substantial way has required a privileged academic position, with the security and flexibility to make time for this work, which has favoured white men in high-income countries. Although the core developers have acknowledged this issue, a lack of time, along with a limited perspective on the barriers faced by under-represented groups, has meant that little action has been taken.

Through this fellowship, I aim to establish Research Software Engineers (RSEs) as part of the solution to creating more sustainable and inclusive large-scale research software projects. I will demonstrate how RSEs can be critical contributors, by modelling this role in the context of the R project. One aspect will be contributing to the maintenance and development of fundamental code in the R project, either the core R codebase, or important add-on packages that are (or will be) orphaned. Making these contributions as a woman myself will demonstrate the potential for RSE positions to support contributors from under-represented groups. It will also show the potential for RSEs to take on work that may be less desirable to people in traditional academic roles (since it does not represent novel research). However, I intend to go beyond this and also demonstrate how RSEs can contribute to promoting sustainability and inclusion. This aspect will form a large component of my work and involve activities such as helping to make the core development process more transparent, mentoring short-term projects, and providing training.

Publications

 
Title Google Season of Docs project: Expand and Reorganize the R Development Guide 
Description The R Project Google Season of Docs 2022 project (GSoD 22) expanded and revised the R Development Guide, an online book aimed at new contributors to the open source software R. New sections covered the message translation infrastructure used by R, installing R from source on Linux and Windows, and a GitHub workflow for testing proposed patches. Heather Turner was a project administrator and a member of the steering committee for this project.
Type Of Material Improvements to research infrastructure 
Year Produced 2022 
Provided To Others? Yes  
Impact The improvements to the guide have assisted initiatives by the R Contributions Working Group to onboard new contributors to the open source software R. In particular, when mentoring new contributors in the R Contributor Office Hours or on the R Contributor Slack group, we have been able to point to some of the new sections which they have been able to follow in their own time to take the next step in their journey as open source contributors. 
URL https://developers.google.com/season-of-docs/docs/2022/participants
 
Title Google Season of Docs project: useR! Knowledgebase and Infoboard 
Description The R Project Google Season of Docs 2021 project (GSoD 21) created an organizer knowledgebase and information dashboard for the useR! conference. Together they provide historical information on past useR! conferences and guidance for the organizing team. This is an improvement to research infrastructure since useR! is the main international conference of R users and developers, enabling knowledge exchange between academia and practitioners (usually around half of the participants are from business, government or the non-profit sector). The organizing team is required to have substantial involvement of academic partners who take this task on as a service, so reducing organisational effort gives more time to focus on research. Heather Turner was a project administrator and a member of the steering committee for this project.
Type Of Material Improvements to research infrastructure 
Year Produced 2021 
Provided To Others? Yes  
Impact The outputs from GSoD 21 are helping to preserve information from one year to the next, ensuring good practices are carried forward and avoiding time spent reinventing the wheel. This is particularly important in the context of equality, diversity and inclusion, to ensure that practices that had a positive impact on widening participation are carried forward, so that progress made in one year is not lost in subsequent years. Having information available in a more accessible and navigable format lightens the load on useR! volunteers and provides valuable information to other conference organizers and the wider R community. 
URL https://developers.google.com/season-of-docs/docs/2021/participants
 
Title R package: nlsCompare 
Description An R package to compare R functions designed to solve nonlinear least squares (nls) problems. Heather Turner is a contributor to this package. 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact This package was developed as part of a Google Summer of Code 2021 project to compare alternative nonlinear least squares (nls) solvers in R. Its primary benefit is therefore to the co-authors and other researchers, e.g., authors of R packages that provide nls functions, for validation and benchmarking purposes. Applying the package during the project produced some interesting results, such as a case where a solver outperformed the reference solver, generating some promising directions for future research.
URL https://github.com/ArkaB-DS/nlsCompare
 
Title RmdConcord: Concordances for 'R Markdown' v0.1.4 
Description This package supports concordances in 'R Markdown' documents. R Markdown is a form of literate programming that allows you to include R code chunks in a document that uses the Markdown language to mark up text. During compilation, the R Markdown file is first converted to a regular Markdown file by replacing each code chunk with the output of running that code; the Markdown file is then converted to the final HTML output, which contains both the output of the code and the formatted text. A concordance is a mapping between lines in the HTML file and lines in the original R Markdown file. Heather Turner is a contributor to this package.
Type Of Technology Software 
Year Produced 2023 
Open Source License? Yes  
Impact This software has only just been released; however, it will be useful for tracing back errors produced by programs such as HTML Tidy that report only the HTML line numbers. Since HTML Tidy is used by the package checking tool included with R (R CMD check), this is particularly helpful for R package developers.
URL https://cran.r-project.org/package=RmdConcord
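
Illustrative note: the idea of a concordance described above can be sketched in a few lines. This is a minimal, language-neutral sketch in Python, not RmdConcord's actual API or data format; the table `CONCORDANCE`, its values, and the helper `source_line` are all hypothetical and exist only to show how an error reported against an HTML line could be traced back to a line in the R Markdown source.

```python
from bisect import bisect_right

# Hypothetical concordance (made-up values): each pair means
# "output lines from out_start onward came from source line src_line",
# until the next pair takes over.
CONCORDANCE = [
    (1, 1),   # HTML lines 1-4   <- R Markdown line 1
    (5, 3),   # HTML lines 5-11  <- R Markdown line 3
    (12, 7),  # HTML lines 12+   <- R Markdown line 7
]


def source_line(html_line, concordance=CONCORDANCE):
    """Map a 1-based HTML line number back to its source line."""
    starts = [out_start for out_start, _ in concordance]
    # Find the last entry whose out_start is <= html_line.
    idx = bisect_right(starts, html_line) - 1
    if idx < 0:
        raise ValueError("line precedes the first concordance entry")
    return concordance[idx][1]


# A checker that only knows HTML line numbers reports a problem at line 9;
# the concordance points back to the source line to edit.
reported_html_line = 9
print(source_line(reported_html_line))
```

Storing only the points where the mapping changes keeps the table short, and a binary search over those start points is enough to resolve any reported line.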
 
Title certificate: R package for generating workshop attendance certificates 
Description This package is a spin-off from the fwdbrand package from the R Forwards taskforce, for generating certificates of attendance. Heather Turner is a co-author and maintainer of this package. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact This package was used to provide certificates of attendance for participants of the Collaboration Campfires, so that they could count attendance as continuing professional development. 
URL https://github.com/forwards/certificate
 
Title gnm: package for generalized nonlinear models, v1.1-2 
Description The gnm R package provides functions to specify and fit generalized nonlinear models. This new release extended the `Symm()` function to handle more than two factors. Heather Turner is a co-author and maintainer of this package.
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact This update allows the Complete Symmetry model to be estimated with more than two categorical factors, such as the example in Table 8.5 of Analysis of Ordinal Categorical Data (Agresti, 2010, DOI:10.1002/9780470594001). 
URL https://cran.r-project.org/package=gnm
 
Title rebib : Parse/Convert embedded LaTeX bibliography to BibTeX 
Description This package was developed as part of Google Summer of Code 2022 to convert the formatted bibliography in a LaTeX file (.bbl format) to a BibTeX bibliography (.bib format) for re-processing. Heather Turner is a contributor to this package. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact In 2021, The R Journal introduced a new HTML format and since then there has been a gradual transition from the legacy PDF format. Past instructions for authors asked them to embed the formatted bibliography generated by BibTeX into their LaTeX source file. This package converts the embedded bibliography to a .bib file, which is more useful when converting the article to HTML. This package is a dependency of the texor package, created under the same GSoC project. Together, the two packages enable previously published R Journal articles to be converted to HTML, which is easier to browse and more accessible for users of assistive tools and technologies such as screen readers. 
URL https://github.com/Abhi-1U/rebib
 
Title texor : Tools for converting LaTeX source files into RJ-web-articles 
Description This package was developed as part of Google Summer of Code 2022 to aid the conversion of LaTeX source files to HTML. A basic conversion can be achieved with pandoc, but equations, code blocks, tables, figures, cross-references and other aspects of the document may not be correctly converted. The texor package handles many of these issues to reduce the manual post-editing required. Heather Turner is a contributor to this package. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact In 2021, The R Journal introduced a new HTML format and since then there has been a gradual transition from the legacy PDF format. This package enables previously published articles to be converted to HTML, which is easier to browse and more accessible for users of assistive tools and technologies such as screen readers. 
URL https://github.com/Abhi-1U/texor
 
Description Bug BBQ 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The Bug Barbecue (Bug BBQ) was a global, online event spread over 24 hours, in which new and experienced contributors could help to address open bugs in R. For new contributors, the event gave an introduction to the R development process and the opportunity to learn from more experienced contributors. For all participants, even members of the R Core team, the event provided a dedicated time to work on maintaining this open source project, which can be hard to find time for on top of other commitments. Heather Turner co-organized this event with members of the R Contribution Working Group.
Year(s) Of Engagement Activity 2022
URL https://github.com/r-devel/rcontribution/blob/main/bug_bbq/Bug_BBQ_retrospective.md
 
Description Collaboration Campfires 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The Collaboration Campfires were a series of online collaborative events to open pathways for members of groups currently underrepresented in the R project. The goal was to demystify the R development process and highlight ways that R programmers can contribute, with a focus on contributions that demand little in terms of time commitment and prerequisite knowledge. Some of the participants have gone on to become more involved in the R Contributor community, attending other events organized by the R Contribution Working Group, including a translation hackathon which resulted in new/updated translations being committed to the core R codebase. Heather Turner co-organized these events with Saranjeet Kaur, as part of the Code for Science and Society Digital Infrastructure Incubator.
Year(s) Of Engagement Activity 2022
URL https://contributor.r-project.org/events/collaboration-campfires
 
Description Incubator: The role of the R community in the RSE movement 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Heather Turner co-organised an incubator session at the useR! 2021 conference on "The role of the R community in the RSE movement". It brought together around 30 members of the R community, with representatives from Europe, North America, the Middle East, Asia, Africa and Oceania. The group ranged from people that had not heard of the term "Research Software Engineer" to people that had RSE as their job title. The event consisted of a short talk on the RSE movement by an invited representative from the Society of RSE (SocRSE), followed by breakout discussions on topics such as promoting RSE career paths and promoting RSE skills within the R community.

The event helped to raise awareness of the RSE profession and ways to connect to the RSE community. An #r-users channel was created on the RSE Slack workspace, which 42 people have joined. People were encouraged to join the Society of RSE (SocRSE) and participate in the online SeptembRSE conference. One incubator attendee, Kim Martin, was subsequently awarded a Software Sustainability Institute fellowship to promote RSE skills at Stellenbosch University, South Africa (https://fellows.software.ac.uk/fellow/kim-martin/) and credits useR! as a turning point. Another attendee, Saranjeet Kaur, was selected for the Open Life Science mentoring & training program to launch an RSE Association in Asia, inspired by SocRSE (https://openlifesci.org/ols-4/projects-participants/). Yet another attendee, Matt Bannert, has started recording a series of podcast episodes for the RSE Stories podcast (http://us-rse.org/rse-stories/about/, episodes yet to be aired).
Year(s) Of Engagement Activity 2021
URL https://user2021.r-project.org/blog/2021/09/04/role-of-r-in-research-software-engineering/
 
Description Letter to Nature Correspondence 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This letter was written by members of the global organizing team for the online useR! 2021 conference, including Heather Turner. Due to the restriction on the number of authors for this format, a single author submitted the correspondence on behalf of the team. The intention of the letter was to encourage professionals across domains to keep (scientific) conferences online, even when the COVID-19 pandemic subsides, in the interests of equality, diversity and inclusion.
Year(s) Of Engagement Activity 2021
URL https://www.nature.com/articles/d41586-021-02752-8
 
Description News from the Forwards Taskforce (R Journal) 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Heather Turner writes a regular news column for the R Journal, updating R users and developers about the activities of Forwards, the R Foundation taskforce for women and underrepresented groups. As chair of the taskforce, Heather rounds up the news from the different sub-teams. This covers Forwards' involvement in community groups, conference organization, initiatives to help R users advance their knowledge and experience of R development, and the collection and analysis of data related to equality, diversity and inclusion in the R community. This helps to raise awareness of the work of Forwards and to give credit to the people involved, who do this as service work, often in their spare time.
Year(s) Of Engagement Activity 2021,2022,2023
URL https://journal.r-project.org/news.html
 
Description Post on R Blog: "R Can Use Your Help: Translating R Messages" 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Heather Turner co-wrote this blog post with Saranjeet Kaur, a member of the R Contribution Working Group. The post was intended to inform people about the current status of message translations in the R codebase and encourage people to get involved in contribution. As a result, several people joined the R Contributors Slack group to find out more and/or started contributing via the new Weblate server, an online platform facilitating translation.
Year(s) Of Engagement Activity 2022
URL https://blog.r-project.org/2022/07/25/r-can-use-your-help-translating-r-messages/index.html
 
Description R Consortium Blog post: Comeback! Reviving the Warwick R User Group with In-Person and Online Events 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact As a member of the organizing team of the Warwick R User Group, Heather Turner was interviewed about the team's experience re-building the group after the COVID-19 pandemic. The blog post was liked and shared by dozens of people across social media outlets.
Year(s) Of Engagement Activity 2023
URL https://www.r-consortium.org/blog/2023/01/25/comeback-reviving-the-warwick-r-user-group-with-in-pers...
 
Description R Contributor Office Hours 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Heather Turner and Ella Kaye, along with colleagues from the R Contribution Working Group, have started to run office hours for contributors to R. These events have attracted people that want to find out more about contributing and we have been able to provide information, give live demonstrations and guide them towards next steps. The events have also attracted existing contributors who have got stuck on some technical detail or want advice on how to progress the issue they are working on.
Year(s) Of Engagement Activity 2022,2023
URL https://contributor.r-project.org/events/office-hours/
 
Description R Contributors Website 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Heather Turner expanded the R Contributors website to create a more substantial resource for people interested in contributing to the R Project. She added an events page, tutorials, and links to resources such as the R Developer Blog, the R Development Guide and a dashboard for tracking development of the R project. She also obtained approval from the R Foundation to link directly from the main R Project website and moved to an official subdomain: contributor.r-project.org. As of 12 March 2022, website analytics show over 2000 unique visitors since moving to the new subdomain at the beginning of January 2022, with several hundred visitors exploring key subpages, e.g. the joining page for the R-devel Slack, the information page for the R Contribution Working Group and the events page for the Collaboration Campfires aimed at novice contributors. Visitors come from all major geographical regions; the top 10 countries by number of visitors include the USA, UK, India, China, Brazil, Germany, Russian Federation, Indonesia and Australia. The improved website and new subdomain have attracted more members to the R-devel Slack workspace and supported other engagement activities.
Year(s) Of Engagement Activity 2021
URL https://contributor.r-project.org/
 
Description R translatón/Hackaton de tradução do R 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Heather Turner supported local organizers to put on this translation hackathon as a satellite to the LatinR2022 conference, to specifically encourage contributions to the Spanish and Brazilian Portuguese translations in R. The event attracted completely new contributors as well as those with some experience. During the event, the participants contributed around 500 message translations and some have continued to participate since.
Year(s) Of Engagement Activity 2022
URL https://www.youtube.com/watch?v=5LURMdf1Uk8
 
Description R-Ladies Remote Package Development Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Heather Turner co-taught an R Package Development workshop for R-Ladies Remote, using the materials created by the teaching team of Forwards, the R Foundation taskforce for underrepresented groups. The workshop was split into four 1-hour modules over 4 weeks, enabling people to attend modules according to their availability and prior knowledge/experience. Around 25 people attended in total. Several participants of the workshop joined the R-Ladies Remote Slack, which is where the regular activity of the R-Ladies Remote chapter takes place. The workshops were recorded and the videos were shared in the Slack workspace, so that people could catch up later. Since the workshop was delivered live, it was only practical for R users in the Americas, Europe and Africa to attend; however, there was interest in repeating the series at a time suited to Asia and Oceania.
Year(s) Of Engagement Activity 2022
URL https://www.meetup.com/rladies-remote/events/qvxsrsydccbwb/
 
Description RSE incubator blog post 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post reporting on an incubator session that Heather Turner co-organized at the global useR! conference on the role of the R community in the research software engineering movement. The blog post was intended to increase the impact of the interactive, live session, which involved 7 breakout discussions that were not recorded. The post includes a recording of the short talk given at the start of the session by a representative of the Society of Research Software Engineering, which has been viewed 381 times (as of 7 March 2022).
Year(s) Of Engagement Activity 2021
URL https://user2021.r-project.org/blog/2021/09/04/role-of-r-in-research-software-engineering/
 
Description SocRSE blog post (2021 RSE Fellows series) 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post as part of a series introducing the 2021 RSE Fellows, intended to promote our activities to the research software engineering community and beyond. For Heather Turner, the post helped to promote her fellowship plans to the wider community of R users and developers, in academia and industry. The post included a 2-minute video used to promote the blog post on Twitter, which has been viewed more than 2500 times as of 7 March 2022 (https://twitter.com/ResearchSoftEng/status/1422866514382893060?s=20&t=Oxfuz0-AmNGsCEa7MZM2Jw).
Year(s) Of Engagement Activity 2021
URL https://society-rse.org/getting-to-know-your-2021-rse-fellows-heather-turner/
 
Description The R Podcast: Collaboration Campfires Episode 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Heather Turner was interviewed, along with her collaborator Saranjeet Kaur, on an episode of The R Podcast, a long-established podcast of the R community. They were promoting the Collaboration Campfires, a series of events that they were organizing to encourage new contributors to the R project. This gave an opportunity to discuss the broader context of efforts to improve sustainability and equality, diversity and inclusion in the R project. The podcast stimulated some discussion on Twitter about this broader context. A potential industrial collaborator based in Canada connected over LinkedIn as a result of listening to the podcast and arranged a meeting during a later visit to the UK.
Year(s) Of Engagement Activity 2022
URL https://r-podcast.org/034-collaborative-campfires/
 
Description Workshop on Parallel Computing in R 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact This workshop was intended to introduce R users to parallel computing in the context of high performance computing (HPC) clusters, to encourage wider use of Sulis, a Tier-2 HPC platform hosted at Warwick University. The workshop was open to any UK researcher or anyone that might collaborate with a UK researcher. Heather Turner helped to plan the workshop, including defining the scope, recommending and inviting an external tutor, reviewing tutorial materials and assisting during the workshop. The external tutor was Michael Schubert (Netherlands Cancer Institute), the author of a key package to facilitate working on HPC platforms using R. There were around 20 participants at the workshop, mainly from Warwick, but including participants from several other UK universities. In addition to upskilling intended participants, the workshop enabled Heather and other Research Software Engineers at Warwick to fill gaps in their knowledge, equipping them to provide better support for R users that want to use HPC facilities.
Year(s) Of Engagement Activity 2022
URL https://warwick.ac.uk/research/rtp/sc/sulis/events/parallelr/