Cloud-SPAN: Specialised analyses for environmental 'omics with Cloud-based High Performance Computing

Lead Research Organisation: University of York
Department Name: Biology

Abstract

Environmental Biotechnology (EB) addresses global challenges using engineered microbial systems for environmental protection, bio-remediation and resource recovery. It is a critical and expanding area for the UK and underpins some of the world's most important industries. This is acknowledged by the funding invested in the creation of BBSRC's Networks in Industrial Biotechnology and Bioenergy (NIBBs).
A deep mechanistic understanding of the complex microbial communities involved in the biological cycling of global resources is essential to meet global challenges such as Net Zero, waste management and increased demand. The complexity of these microbiomes can be orders of magnitude larger than those found in the human gut, requiring different approaches to experimental design and analysis with High Performance Computing (HPC). However, EB is an interdisciplinary field that attracts researchers from a broad range of disciplines including Mathematics, Engineering, Biology, Social Sciences, Management, Physics and Chemistry, and big data 'omics analyses on HPC systems are often not core skills for such researchers.
This proposal aims to develop and deliver highly accessible resources that will upskill these interdisciplinary researchers so that they are able to generate and analyse big data relating to EB using Cloud HPC. Although infrastructure and resources exist for microbial 'omics (e.g. JGI's IMG/M, MG-RAST, Galaxy, CLIMB, EBI), there is a lack of systematic training tightly linked to the EB domain, and documentation is often focussed on technical proficiency rather than contextualised with a strong understanding of experimental design. We will provide foundational training and develop and deliver new advanced modules covering the specialised skills required to generate and analyse 'omics data using Cloud HPC resources. These will include experimental design and statistical modules to ensure researchers can generate data appropriate to investigate their research question. Modules will deploy cloud-based containerised instances provided by Google Education and Amazon Web Services (AWS) for exemplar workflows free to the learner. They will form a complete training resource with fully articulated prerequisites and learning objectives that can be used for in-person or online tutor-led workshops or self-paced learning. Our proposal offers structured Learning Paths, from the statistical skills required for robust experimental design through to the reproducible execution and interpretation of 'omics analyses with HPC. These paths cater to researchers with differing levels of previous experience and allow self-assessment of training needs. We also provide Diversity Scholarships to enable members of underrepresented groups to participate in online or in-person training.
The collaboration between the University of York and the Software Sustainability Institute (SSI) brings together excellence in data science pedagogy and environmental 'omics research with the SSI's UK-leading expertise in research computing and community building. This will ensure the training developed genuinely complements, and aligns with, existing materials to enhance national provision.
Sustainability will be fostered by making the resources Findable, Accessible, Interoperable and Reusable (FAIR), by providing cross-platform images for deployment, by developing and proactively engaging with a Community of Practice, and by providing Code Retreats for the supported application of methods to participants' own data. In addition, "Cloud Administration Guides" will be developed for institutional HPC Teams to run specialised modules with their own resources. These Guides will be supported by 1-to-2 day training delivered by Cloud-SPAN systems administrators.
The project will be promoted by our partners, the SSI, Google Education, AWS and the N8 Centre of Excellence in Computationally Intensive Research, and through delivery of conference talks and seminars.

Technical Summary

Environmental Biotechnology (EB) is an interdisciplinary field involving advanced molecular and applied microbiologists, environmental chemists and engineers that addresses global challenges using engineered microbial systems. This proposal aims to train these researchers to generate, analyse and mine big data relating to EB microbiomes, which are larger than those found in the human gut and require different approaches to both measurement and analysis in order to manage reagent costs and effectively leverage available HPC resources.
Easily accessible and scalable HPC-based training is required to provide researchers with the skills and self-confidence to manipulate and analyse big data generated from 'omics technologies and generate biological insights from these highly interconnected systems. This area of bioinformatics involves a steep learning curve, which can be compounded by the need to install packages with multiple dependencies onto different HPC architectures based on what is available at a researcher's home institution, even before the user can engage with writing scripts to manage workflows, manipulate or visualise data, or manage job schedulers. We will deploy Cloud-based containerised instances which (1) are accessible to anyone anywhere with an adequate internet connection, (2) have a very low hardware entry requirement and (3) allow for easily scalable and replicable installations of software that will not become deprecated as quickly as might occur on a local server. The cloud providers we are working with run grant schemes that provide significant resources to researchers and that will support deployment of production instances of the images we will generate. We expect that our resources will be easily deployed and used by groups who do not necessarily have dedicated bioinformaticians or expertise in HPC, providing a cost-effective route to useful analyses for researchers in a strategically key area of expansion in the biosciences.
 
Title Automated Management of AWS Instances for Training 
Type Of Art Image 
Year Produced 2024 
URL https://zenodo.org/doi/10.5281/zenodo.12748960
 
Description Member of The Biochemical Society Training Theme Panel. 2019 - present.
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact The Training Theme Panel oversees the training remit of the Society, working to encourage, commission, generate and review proposals for face-to-face and online training courses in the biosciences throughout the year. Cloud-SPAN materials were used to develop the online Python course, and the experience and evaluation data gathered from the Cloud-SPAN project helped to inform training strategy for the Society.
URL https://www.biochemistry.org/events-and-training/research-areas-and-theme-panels/training-theme-pane...
 
Description Reporting of Projects on behalf of 9 DaSH funded projects to secure an additional £200,000 per project.
Geographic Reach National 
Policy Influence Type Implementation circular/rapid advice/letter to e.g. Ministry of Health
Impact The submission of the report secured additional funding to enable additional training resources to be created and additional training workshops to be hosted.
 
Description 'UKRI Digital Research Skills Catalyst' to opportunity OPP630: "Digital infrastructure: new approaches to skills or software"
Amount £496,170 (GBP)
Funding ID Not yet allocated - UKRI299: UKRI Digital Research Skills Catalyst 
Organisation United Kingdom Research and Innovation 
Sector Public
Country United Kingdom
Start 09/2024 
End 03/2027
 
Description Getting started with High Performance Computing: FAIR training for environmental scientists
Amount £29,513 (GBP)
Funding ID NE/X006999/1 
Organisation Natural Environment Research Council 
Sector Public
Country United Kingdom
Start 09/2022 
End 06/2023
 
Description Metagenomics with High Performance Computing for environmental science Doctoral Training
Amount £54,449 (GBP)
Funding ID NE/Y003527/1 
Organisation Natural Environment Research Council 
Sector Public
Country United Kingdom
Start 07/2023 
End 05/2024
 
Title Improvements to research infrastructure - Learning and teaching resources: Statistically useful experimental design - 01 
Description Learning and teaching resources for Experimental design. Experimental design is critical for 'omics experiments in order to generate data capable of addressing your research questions and control your reagent costs. There are choices to be made about sample preparation and storage, sequencing technologies, the numbers of technical and biological replicates and sequencing depth. The most appropriate choices depend on the type of research question you have, the strengths and weaknesses of the platform and the biological variability. Case studies are provided for learners to examine in detail and to discuss some of the most important aspects that need to be taken into account when designing and performing experiments that generate reproducible, high-quality data.
Type Of Material Improvements to research infrastructure 
Year Produced 2023 
Provided To Others? Yes  
Impact Improve skills of researchers. 
URL https://cloud-span.github.io/experimental_design01-principles/
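The link between sequencing depth and reagent cost mentioned in the description above can be made concrete with the standard expected-coverage relation: coverage = number of reads × read length / genome size. The Python sketch below is illustrative only and is not taken from the Cloud-SPAN materials; the function names and the example figures (a 5 Mb genome, 150-base reads, 30x target) are assumptions chosen for the arithmetic.

```python
def expected_coverage(n_reads, read_length, genome_size):
    """Mean per-base coverage: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest whole number of reads reaching the target coverage (integer inputs)."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    total_bases = target_coverage * genome_size
    return -(-total_bases // read_length)  # ceiling division on integers


if __name__ == "__main__":
    # Hypothetical example values, not figures from the course.
    genome_size = 5_000_000   # bases
    read_length = 150         # bases per read
    n = reads_needed(30, read_length, genome_size)
    print(n, expected_coverage(n, read_length, genome_size))
```

For a mixed community the same arithmetic would apply per organism only after scaling the read count by the share of reads that organism contributes, which is one reason depth requirements for environmental samples can be much higher than for a single isolate.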
 
Title Improvements to research infrastructure - Learning and teaching resources: Statistically useful experimental design - 02 
Description Learning and teaching resources for Experimental design. Experimental design is critical for 'omics experiments in order to generate data capable of addressing your research questions and control your reagent costs. There are choices to be made about sample preparation and storage, sequencing technologies, the numbers of technical and biological replicates and sequencing depth. The most appropriate choices depend on the type of research question you have, the strengths and weaknesses of the platform and the biological variability. Case studies are provided for learners to examine in detail and to discuss some of the most important aspects that need to be taken into account when designing and performing experiments that generate reproducible, high-quality data.
Type Of Material Improvements to research infrastructure 
Year Produced 2023 
Provided To Others? Yes  
Impact Improve the skills of researchers. 
URL https://cloud-span.github.io/experimental_design02-case-study/
 
Title Learning and teaching resources: Core R Workshop - June 2023 - Online 
Description These open access online learning materials provide an introduction to R for complete beginners. Learners will be able to find their way round RStudio, use the basic data types and structures in R, and organise their work with scripts and projects. The materials also teach learners how to import data, summarise it and create and format a graph. The workshop assumes no prior experience of coding. Impact: increased knowledge, practical skills and confidence to apply the new skills in learners' own projects.
Type Of Material Improvements to research infrastructure 
Year Produced 2023 
Provided To Others? Yes  
Impact Enables the learner to develop skills in R and provides instructors with teaching materials to deliver the training.
URL https://github.com/Cloud-SPAN/core-r
 
Title Learning and teaching resources: Metagenomics for environmental scientists 
Description These open access online learning materials teach data analysis for metagenomics projects. They are aimed at those with little or no experience of using high performance computing (HPC) for data analysis. The materials cover:
- navigating file directories and using the command line
- logging into a remote cloud instance
- using common commands and running analysis programs in the command line
- what is metagenomics?
- following a metagenomics analysis workflow, including: performing quality control on reads; assembly of reads into a metagenome; improving your assembly with polishing; binning into species/metagenome-assembled genomes (MAGs); taxonomic assignment and functional annotation using your binned reads
Type Of Material Improvements to research infrastructure 
Year Produced 2023 
Provided To Others? Yes  
Impact Enables the learner to develop skills in data analysis for metagenomics projects and provides instructors with teaching materials to deliver the training.
URL https://cloud-span.github.io/nerc-metagenomics00-overview/
 
Title Learning and teaching resources: Statistically useful experimental design 
Description Learning and teaching resources for Experimental design. Experimental design is critical for 'omics experiments in order to generate data capable of addressing your research questions and control your reagent costs. There are choices to be made about sample preparation and storage, sequencing technologies, the numbers of technical and biological replicates and sequencing depth. The most appropriate choices depend on the type of research question you have, the strengths and weaknesses of the platform and the biological variability. Case studies are provided for learners to examine in detail and to discuss some of the most important aspects that need to be taken into account when designing and performing experiments that generate reproducible, high-quality data.
Type Of Material Improvements to research infrastructure 
Year Produced 2022 
Provided To Others? Yes  
Impact Enables the learner to develop skills in R via self-study and provides instructors with teaching materials to deliver the training.
URL https://cloud-span.github.io/experimental_design00-overview/
 
Title Learning and teaching resources: Statistically useful experimental design - Overview 
Description Learning and teaching resources: Statistically useful experimental design. Experimental design is critical for 'omics experiments in order to generate data capable of addressing your research questions and control your reagent costs. There are choices to be made about sample preparation and storage, sequencing technologies, the numbers of technical and biological replicates and sequencing depth. The most appropriate choices depend on the type of research question you have, the strengths and weaknesses of the platform and the biological variability. Case studies are provided for learners to examine in detail and to discuss some of the most important aspects that need to be taken into account when designing and performing experiments that generate reproducible, high-quality data.
Type Of Material Improvements to research infrastructure 
Year Produced 2023 
Provided To Others? Yes  
Impact Improve skills of researchers 
URL https://cloud-span.github.io/experimental_design00-overview/
 
Title Teaching materials: Automated Management of AWS Instances 
Description These teaching resources guide learners to automatically manage multiple Amazon Web Services (AWS) instances - each instance being a Linux virtual machine. Using Bash Shell scripts, learners are shown how to create, configure, stop, start and delete one or multiple instances with a single invocation of a script. The target audience of the course is anyone interested in deploying and managing cloud resources for training. While the course is focused on AWS, and particularly Elastic Compute Cloud (EC2) instances, the scripts can be adapted for use with other cloud providers and other types of cloud services. 
Type Of Material Improvements to research infrastructure 
Year Produced 2024 
Provided To Others? Yes  
Impact These learning resources allow other course instructors at other institutions to utilise AWS instances when hosting a training course which facilitates the provision of easily accessible training for future bioscientists. 
URL https://github.com/Cloud-SPAN/aws-instances?tab=readme-ov-file
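The idea of acting on one or many instances with a single invocation, as described in the record above, can be sketched as follows. This is a minimal Python illustration and not the Cloud-SPAN Bash scripts themselves: the `ACTIONS` table, the `build_command` helper and the placeholder instance IDs are assumptions made for the example. It only assembles standard AWS CLI `ec2` commands (`start-instances`, `stop-instances`, `terminate-instances`) and prints them as a dry run, so nothing is executed against an AWS account.

```python
import shlex

# Hypothetical mapping from a friendly action name to the AWS CLI EC2 subcommand.
ACTIONS = {
    "start": "start-instances",
    "stop": "stop-instances",
    "delete": "terminate-instances",
}


def build_command(action, instance_ids):
    """Return the AWS CLI argument list applying `action` to all given instances."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if not instance_ids:
        raise ValueError("no instance IDs given")
    return ["aws", "ec2", ACTIONS[action], "--instance-ids", *instance_ids]


def main():
    # Placeholder IDs for illustration only; real IDs come from your own account.
    ids = ["i-0123456789abcdef0", "i-0fedcba9876543210"]
    for action in ("stop", "start", "delete"):
        # Dry run: print the command line instead of executing it.
        print(shlex.join(build_command(action, ids)))


if __name__ == "__main__":
    main()
```

Keeping command construction separate from execution makes it easy to review what would happen to a whole class's instances before anything is stopped or deleted.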
 
Description Collaboration with Software Sustainability Institute 
Organisation Software Sustainability Institute
Country United Kingdom 
Sector Public 
PI Contribution The project team participates in a quarterly management meeting with Neil Chue Hong, Director of the Software Sustainability Institute.
Collaborator Contribution Neil Chue Hong, Director of the Software Sustainability Institute, participates in a quarterly management meeting to provide guidance and strategic advice on different aspects of the project.
Impact During our meetings we are able to discuss best practice with regard to creating training materials, managing educational activities and engaging the general public.
Start Year 2021
 
Description Collaboration with the White Rose DTP 
Organisation White Rose University Consortium
Country United Kingdom 
Sector Academic/University 
PI Contribution 2023/12: Emma Rand has designed an advanced course in Metagenomics as the White Rose DTP contribution to the Inter-DTP Skills programme for four BBSRC DTPs (White Rose, Oxford Interdisciplinary Bioscience DTP, Midlands Integrative Biosciences Training Partnership, South West Biosciences Doctoral Training Partnership). This will be delivered in May 2024.
Collaborator Contribution 1. This collaboration has promoted Cloud-SPAN training opportunities to new training cohorts involved in four BBSRC funded DTPs (White Rose, Oxford Interdisciplinary Bioscience DTP, Midlands Integrative Biosciences Training Partnership, South West Biosciences Doctoral Training Partnership). 2. This has allowed widened exposure to the Cloud-SPAN training programmes and resources.
Impact 1. This collaboration has allowed widened exposure nationally to the Cloud-SPAN training programmes and resources
Start Year 2023
 
Description EBnet Collaboration 
Organisation UK Environmental Biotechnology Network
Country United Kingdom 
Sector Academic/University 
PI Contribution James Chong and Sarah Forrester are active members within the EBnet Working Group. Through their work with EBnet they are able to publicise the Cloud-SPAN project, via sharing information and delivering talks at webinars.
Collaborator Contribution EBnet support and promote the training opportunities created through the Cloud-SPAN project. The collaboration also allows members to exchange expertise in the field of HPC driven microbial genomics research, which in turn improves the quality of the Cloud-SPAN training resources.
Impact James Chong and Sarah Forrester are active members within the EBnet Working Group. Through their work with EBnet they are able to publicise the Cloud-SPAN project, via sharing information and delivering talks at webinars.
Start Year 2021
 
Description Link with The Carpentries 
Organisation The Carpentries
Country United States 
Sector Charity/Non Profit 
PI Contribution The collaboration between Cloud-SPAN and The Carpentries has facilitated knowledge exchange between the two organisations.
Collaborator Contribution The collaboration between Cloud-SPAN and The Carpentries has facilitated knowledge exchange between the two organisations.
Impact Knowledge exchange and best practice.
Start Year 2021
 
Title Cloud-SPAN AWS Instance Management scripts 
Description Scripts for managing multiple AWS instances and some data examples: gc_run01_data and gc_run02_data 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
URL https://zenodo.org/doi/10.5281/zenodo.6779268
 
Title Cloud-SPAN AWS Instance Management scripts 
Description Scripts for managing multiple AWS instances and some data examples: gc_run01_data and gc_run02_data 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact Allows course instructors at other institutions to easily use AWS instances when hosting a training course, which facilitates the provision of easily accessible training for future bioscientists.
URL https://zenodo.org/record/6779269
 
Title Cloud-SPAN/create-aws-instance-2-manage-instance: Cloud-SPAN Create Your Own AWS Instance Session 2: Manage Your Instance 
Description Cloud-SPAN Create Your Own AWS Instance Session 2: Manage your instance. Cloud-SPAN is a collaboration between the Department of Biology, University of York and The Software Sustainability Institute funded by the UKRI innovation scholars award (Project Reference: MR/V038680/1) which trains researchers, and the research software engineers that support them, to run specialised analyses on cloud-based high-performance computing infrastructure. The site infrastructure is based on The Carpentries.
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
URL https://zenodo.org/record/6783171
 
Title Web-based App - Self-Assessment Quiz 
Description Using Shiny, an R package, an online interactive web-based app was created to evaluate the level of competence of a participant.
Type Of Technology Webtool/Application 
Year Produced 2022 
Impact This online self-assessment tool has been invaluable to determine the level of competence of participants; based on the results course participants can continue their registration to either an introductory course or an advanced course. 
URL https://shiny.york.ac.uk/er13/prenomics-quiz/#section-why
 
Description Participation at the NorthernBUG meeting on 25 March 2024 and presentation of Management of AWS Instances
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact At the NorthernBUG meeting in Sheffield, Pasky Miranda presented and discussed a poster on behalf of Cloud-SPAN to promote the new training materials on Automated Management of AWS Instances. This training module allows non-experts in this area to learn to configure, create, manage and delete one or multiple AWS instances. This instructor training is appropriate for anyone interested in deploying and managing cloud resources.
Year(s) Of Engagement Activity 2024
URL https://github.com/Cloud-SPAN/cloud-span-graphics/blob/main/misc/Final-AutomatedMgtAWSInstances-Nort...
 
Description BBSRC White Rose DTP in Mechanistic Biology training day 27-03-2024 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Postgraduate students
Results and Impact Spring Training Day, 27th March 2024, University of York
The Spring Training Day will bring together the whole DTP cohort on 27th March at the University of York.

There will also be time for social activities and cohort building.
Year(s) Of Engagement Activity 2024
URL https://www.whiterose-mechanisticbiology-dtp.ac.uk/training-and-events/archive/spring-training-day-2...
 
Description Blog Series March 2023-2024 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact A series of 9 blogs was created from March 2023 to March 2024 in order to achieve the following:
- Promote the Cloud-SPAN project to increase awareness of the project's objectives and achieve recognition within the biology/HPC community and with other universities and similar training providers
- Promote individual training activities and attract registrations for courses
- Attract new followers on our social media account

These objectives were achieved: following the promotion of the blogs, we received enquiries regarding the project, new registrations for training activities and new followers on Twitter and LinkedIn, and we have also built up relationships with other organisations and universities who support the promotion of our activities.
Year(s) Of Engagement Activity 2023,2024
URL https://cloud-span.york.ac.uk/blogs.html#listing-listing-page=1
 
Description Blog re Genomics Course November 2021 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact A blog was written evaluating the first Genomics Course which was delivered in November 2021. The blog was posted on the Cloud-SPAN forum and included on the SSI's website https://software.ac.uk/news/review-cloud-spans-genomics-course
Year(s) Of Engagement Activity 2021
URL https://cloudspan.peerboard.com/post/1021906833
 
Description Blog series to publicise the project and activities 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact A series of 22 blogs was created in order to achieve the following:
- Promote the Cloud-SPAN project to increase awareness of the project's objectives and achieve recognition within the biology/HPC community and with other universities and similar training providers
- Promote individual training activities and attract registrations for courses
- Attract new followers on our social media account

These objectives were achieved: following the promotion of the blogs, we received enquiries regarding the project, new registrations for training activities and new followers on Twitter and LinkedIn, and we have also built up relationships with other organisations and universities who support the promotion of our activities.
Year(s) Of Engagement Activity 2021,2022,2023
URL https://cloudspan.peerboard.com/
 
Description Code Retreat - April 2022, University of York 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact The Code Retreat brings small communities of practice together to explore, learn and grow. This event helps broaden the knowledge and practical experience of the participants. They also have the opportunity to network with individuals from different institutions and workplaces.

Learning outcomes from the event:
Working with your peers and with help from our instructors, you could:
- Revise our Prenomics or Genomics courses
- Get help organising and documenting your own analysis
- Apply tools taught in Genomics to your own data
- Get help with Creating your own Amazon Web Services instance for Genomics
- Network with other genomics researchers
Year(s) Of Engagement Activity 2022
URL https://cloud-span.york.ac.uk/community#h.ma5203jptwz0
 
Description Code Retreat - January 2023, University of York 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact The Code Retreat brings small communities of practice together to explore, learn and grow. This event helps broaden the knowledge and practical experience of the participants. They also have the opportunity to network with individuals from different institutions and workplaces.
Learning outcomes from the event:
Working with your peers and with help from our instructors, you could:
- Revise our Prenomics or Genomics courses
- Get help organising and documenting your own analysis
- Apply tools taught in Genomics to your own data
- Get help with Creating your own Amazon Web Services instance for Genomics
- Network with other genomics researchers
Year(s) Of Engagement Activity 2023
URL https://cloud-span.york.ac.uk/community#h.ma5203jptwz0
 
Description Code Retreat - January 2024, University of York 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact The Code Retreat brings small communities of practice together to explore, learn and grow. This event helps broaden the knowledge and practical experience of the participants. They also have the opportunity to network with individuals from different institutions and workplaces.
Learning outcomes from the event:
Working with your peers and with help from our instructors, you could:
- Revise our Prenomics or Genomics courses
- Get help organising and documenting your own analysis
- Apply tools taught in Genomics to your own data
- Get help with Creating your own Amazon Web Services instance for Genomics
- Network with other genomics researchers
Year(s) Of Engagement Activity 2024
URL https://cloud-span.york.ac.uk/upcoming/Code%20Retreat/
 
Description Code Retreat - May 2023 - University of York 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Postgraduate students
Results and Impact The Code Retreat brings small communities of practice together to explore, learn and grow. This event helps broaden the knowledge and practical experience of the participants. They also have the opportunity to network with individuals from different institutions and workplaces.
Learning outcomes from the event:
Working with your peers and with help from our instructors, you could:
- Revise the course materials
- Get help organising and documenting your own analyses
- Apply tools taught in Meta/Genomics to your own data
- Get help with Creating your own Amazon Web Services instance for Genomics
- Network with other researchers.
Year(s) Of Engagement Activity 2023
URL https://cloud-span.york.ac.uk/upcoming/Core%20R/
 
Description Code Retreat - November 2022, University of York 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The Code Retreat brings small communities of practice together to explore, learn and grow. This event helps broaden the knowledge and practical experience of the participants. They also have the opportunity to network with individuals from different institutions and workplaces.
Learning outcomes from the event:
Working with your peers and with help from our instructors, you could:
- Revise our Prenomics or Genomics courses
- Get help organising and documenting your own analysis
- Apply tools taught in Genomics to your own data
- Get help with Creating your own Amazon Web Services instance for Genomics
- Network with other genomics researchers
Year(s) Of Engagement Activity 2022
URL https://cloud-span.york.ac.uk/community#h.ma5203jptwz0
 
Description Core R Workshop - April 2024 - Online 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact This online two-hour workshop is an introduction to R for complete beginners. It teaches learners how to find their way round RStudio, use the basic data types and structures in R, and organise their work with scripts and projects. It also teaches learners how to import data, summarise it and create and format a graph. The workshop assumes no prior experience of coding. Impact: increased knowledge, practical skills and confidence to apply the new skills in learners' own projects.
Year(s) Of Engagement Activity 2024
URL https://cloud-span.github.io/core-r/02_intro_to_r_and_working_with_data.html#1
 
Description Core R Workshop - June 2023 - Online 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact This online two-hour workshop is an introduction to R for complete beginners.

It teaches learners how to find their way round RStudio, use the basic data types and structures in R and how to organise their work with scripts and projects.
It also teaches learners how to import data, summarise it and create and format a graph. The workshop assumes no prior experience of coding.

Impact: increased knowledge, practical skills and confidence to apply the new skills in learners' own projects.
Year(s) Of Engagement Activity 2023
URL https://github.com/Cloud-SPAN/core-r
 
Description Creation of a Cloud-SPAN Slack channel. April 2022 - February 2024 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact The Cloud-SPAN Channel was created in April 2022.
The channel provides an easily accessible platform to facilitate networking between learners and it also allows learners to get in touch with instructors to request assistance with course materials.
Between April 2022 and February 2024, a total of 3,091 messages were sent and the channel had 163 members.
Year(s) Of Engagement Activity 2022
URL https://join.slack.com/t/cloud-span/shared_invite/zt-2eiagvnde-_PNQY72Vax0FFJZ5RdWNTQ
 
Description Creation of the Cloud-SPAN Handbook 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact In our Cloud-SPAN community we encourage everyone to come together to find solutions to problems and exchange experiences and knowledge. Our aim is to build a friendly and involved community of people who have used our resources, are interested in our resources, or who have expertise in the areas we cover. Ways to contribute include attending one of our courses, asking/answering questions on our community forum and making suggestions for improvements to our courses.

Handbook
This handbook is intended as a reference for both the core Cloud-SPAN team and for our wider community of learners. It's where you'll find our Code of Conduct, contributing guidelines and other practical information which will help you make the most of our resources in a friendly, understanding environment.
Year(s) Of Engagement Activity 2021
URL https://cloud-span.github.io/CloudSPAN-handbook/index.html
 
Description Creation of the Cloud-SPAN LinkedIn Account 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact A Cloud-SPAN LinkedIn account was created to help provide an online presence for the Cloud-SPAN project. The different training activities are promoted via LinkedIn, which also allows the general public to ask any questions they may have.

Impact: enables the dissemination of the project details to a wider audience and generates registrations to current activities
Year(s) Of Engagement Activity 2021
URL https://www.linkedin.com/company/cloud-span
 
Description Creation of the Cloud-SPAN Twitter Account 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A Cloud-SPAN Twitter Account was created to help provide an online presence for the Cloud-SPAN project. It also enables the following:
1. Promote the Cloud-SPAN project to an international online audience
2. Promote the registration of various activities
3. Host News stories and blogs
4. Promote information regarding scholarships

Impact: enables the dissemination of the project details to a wider audience and generates registrations to current activities
Year(s) Of Engagement Activity 2021
URL https://twitter.com/SpanCloud
 
Description Creation of the Cloud-SPAN online Forum 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact In our Cloud-SPAN community we encourage everyone to come together to find solutions to problems and exchange experiences and knowledge. Our aim is to build a friendly and involved community of people who have used our resources, are interested in our resources, or who have expertise in the areas we cover. Ways to contribute include attending one of our courses, asking/answering questions on our community forum and making suggestions for improvements to our courses.

The Cloud-SPAN forum is a place to ask questions, pick people's brains and share any insights you've gained during or after one of our courses. It will be the main hub of the Cloud-SPAN community of practice. We strongly encourage you to engage with the Cloud-SPAN community to enhance your learning and understanding.

Impact: enables the dissemination of the project details to a wider audience and generates registrations to current activities. It also provides an opportunity for the audience to learn and develop their knowledge and skills.
Year(s) Of Engagement Activity 2021
URL https://cloudspan.peerboard.com/
 
Description Creation of the Cloud-SPAN website 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A Cloud-SPAN website was created to help provide an online presence for the Cloud-SPAN project. It also enables the following:
1. Promote the Cloud-SPAN project to an international online audience
2. Promote and organise registration of various activities
3. Host News stories and blogs
4. Promote information regarding scholarships
5. Provide a platform for individuals to request further information or ask any questions

Impact: enables the dissemination of the project details to a wider audience and generates registrations to current activities
Year(s) Of Engagement Activity 2021
URL https://cloud-span.york.ac.uk/
 
Description EBNet Webinar: Using Big Data Approaches to Understand Microbial Communities 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact EBNet Webinar: Using Big Data Approaches to Understand Microbial Communities
Thursday, 10th February 2022 at 13.00 - 14.15.
The SESSION RECORDING is available here https://www.youtube.com/watch?v=1QH0JK0X0Xw

EBNet are hosting a series of specialist webinars to support knowledge exchange amongst members, including "Using Big Data Approaches to Understand Microbial Communities". Hear the latest developments from top speakers and participate in the online chat to engage with questions.

This fascinating session is brought to you by the Chairs: Dr Sarah Forrester, the Chong Group, Dept. of Biology, University of York & Dr Bing Guo.

Dr Sarah Forrester is a PDRA in James Chong's group in the Biology department at the University of York. She gained her PhD at the University of Liverpool in 2016 using multi 'omic approaches to analyse parasite genomic data, and has worked since then on a range of microbial systems and used a variety of bioinformatic methods. She performs HPC driven microbial genomics research and delivers bioinformatics training. As a 2022 Software Sustainability fellow and a certified Software Carpentry instructor, she is passionate about instilling good bioinformatic practices into her training. She is also involved in the preparation and delivery of the material for Cloud-SPAN: Specialised analyses for environmental 'omics with Cloud-based High Performance Computing; see https://cloud-span.york.ac.uk/.

TALK TITLE: INTRODUCTION TO THE EBNET BIOINFORMATICS WORKING GROUP
Prof James Chong is a Royal Society Industry Fellow and Professor in the Department of Biology at the University of York, where he runs a research group exploiting a range of 'omics techniques to understand microbial community dynamics, as well as leading the EBNet Working Group "Bioinformatics Training for Microbial Environmental Biotechnologies". His group is involved in generating microbial community metagenomics, meta-transcriptomics and metabolomics datasets. His group use established analytical pipelines, but also develop their own bespoke scripts for data analysis. Insight into the application of 'omics techniques, and the ways in which they can be applied to environmental biotechnology use cases to better understand microbial community dynamics, has driven his desire to develop bioinformatic training resources. This is currently being supported by the UKRI Grant Cloud-SPAN: Specialised analyses for environmental 'omics with Cloud-based High Performance Computing, which is co-led by James; see https://cloud-span.york.ac.uk/.

Impact: enables the dissemination of the project details to a wider audience and generates registrations to current activities
Year(s) Of Engagement Activity 2022
URL https://ebnet.ac.uk/ebnet-rc22-bigdata/
 
Description EBNet Webinar: Why Bioinformatics Training is Important Emma Rand presented a talk on Cloud-SPAN - 11 May 2022 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact This seminar focused on why bioinformatics training is important. A general introduction was delivered by Prof. James Chong:
Introduction "Why bioinformatics training is important": Prof. James Chong, University of York (Head of the Bioinformatics Working Group)

Emma Rand is a Senior Lecturer in the Department of Biology at the University of York where she specialises in teaching data science and reproducibility, particularly to those who do not see themselves as programmers. She delivered a presentation on the training opportunities and resources provided by the Cloud-SPAN project. This promoted all activities that were open for registration, along with the website, LinkedIn and Twitter accounts.

Link to event https://ebnet.ac.uk/wgbioinf-110522/
Year(s) Of Engagement Activity 2022
 
Description EBNet Working Group Coordinator 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact James Chong is the Working Group Chair for EBNet. This WG aims to create bioinformatics training for microbial Environmental Biotechnologies. In this role James is able to make new connections and publicise the work of Cloud-SPAN.
Year(s) Of Engagement Activity 2021
URL https://ebnet.ac.uk/about/wg-details/wg-bioinformatics/
 
Description European Biosolids and Bioresource Conference 2022. 22nd and 23rd of November in Birmingham James Chong and Sarah Forrester 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact James Chong and Sarah Forrester delivered a Bioinformatics session, with a 40-minute Q & A on metagenomics, bioinformatics and access to training.

European Biosolids and Bioresource Conference 2022, 22nd and 23rd of November in Birmingham; ~40 people attended.

13:35 - 13:50 Bioinformatics-based diagnostics for monitoring AD - Professor James Chong, University of York, UK
13:50 - 14:05 Using multi-omic approaches to understand the co-digestion of wheatstraw and sewage sludge - Dr. Sarah Forrester, Senior Post Doc, University of York, UK
Year(s) Of Engagement Activity 2022
URL https://european-biosolids.com/wp-content/uploads/2022/11/European-Biosolids-Bioresources-Conference...
 
Description Metagenomics online training course April 2023 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact A training course was organised online for 30 participants, comprising online lectures, drop-in help sessions and a Slack channel for support.

Following completion of this course, learners will be trained to:
explain the hierarchical structure of a file system and describe the files and file structure used in the course
explain what is meant by a working directory, a path and a relative path and write down paths that they will need for the course
start a Terminal (Mac) or Git Bash Terminal (Windows)
navigate a file system using the command line
log in to and exit their AWS instance (the cloud)
use common commands such as ls, pwd and cd, on the command line
know the difference between genomics and metagenomics
describe the steps in a metagenomic workflow
perform quality control on reads and assemble them into a metagenome
perform polishing to improve an assembly
use binning to separate the metagenome into different species or MAGs (Metagenome-Assembled Genomes)
use Kraken 2 to assign taxonomy to reads and contigs and phyloseq in R to analyse taxonomic diversity
Year(s) Of Engagement Activity 2023
URL https://sites.google.com/york.ac.uk/cloud-span/train-with-us/specialised-skills#h.jqgzsdc8hbla
 
Description Metagenomics online training course February 2024 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact A training course was organised online for 30 participants, comprising online lectures, drop-in help sessions and a Slack channel for support.

Following completion of this course, learners will be trained to:
explain the hierarchical structure of a file system and describe the files and file structure used in the course
explain what is meant by a working directory, a path and a relative path and write down paths that they will need for the course
start a Terminal (Mac) or Git Bash Terminal (Windows)
navigate a file system using the command line
log in to and exit their AWS instance (the cloud)
use common commands such as ls, pwd and cd, on the command line
know the difference between genomics and metagenomics
describe the steps in a metagenomic workflow
perform quality control on reads and assemble them into a metagenome
perform polishing to improve an assembly
use binning to separate the metagenome into different species or MAGs (Metagenome-Assembled Genomes)
use Kraken 2 to assign taxonomy to reads and contigs and phyloseq in R to analyse taxonomic diversity
Year(s) Of Engagement Activity 2024
URL https://cloud-span.github.io/nerc-metagenomics00-overview/
 
Description Metagenomics online training course November 2023 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact A training course was organised online for 30 participants, comprising online lectures, drop-in help sessions and a Slack channel for support.

Following completion of this course, learners will be trained to:
explain the hierarchical structure of a file system and describe the files and file structure used in the course
explain what is meant by a working directory, a path and a relative path and write down paths that they will need for the course
start a Terminal (Mac) or Git Bash Terminal (Windows)
navigate a file system using the command line
log in to and exit their AWS instance (the cloud)
use common commands such as ls, pwd and cd, on the command line
know the difference between genomics and metagenomics
describe the steps in a metagenomic workflow
perform quality control on reads and assemble them into a metagenome
perform polishing to improve an assembly
use binning to separate the metagenome into different species or MAGs (Metagenome-Assembled Genomes)
use Kraken 2 to assign taxonomy to reads and contigs and phyloseq in R to analyse taxonomic diversity

Impact: Upskill postgraduate researchers in bioinformatics.
Year(s) Of Engagement Activity 2023
URL https://cloud-span.github.io/nerc-metagenomics00-overview/
 
Description Metagenomics self-study cohort - April 2023 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact A self-study option was created for a small cohort of 20 learners to study the Metagenomics module. Their learning was supported with drop-in help sessions and a Slack channel.

Following completion of this course, learners will be trained to:
explain the hierarchical structure of a file system and describe the files and file structure used in the course
explain what is meant by a working directory, a path and a relative path and write down paths that they will need for the course
start a Terminal (Mac) or Git Bash Terminal (Windows)
navigate a file system using the command line
log in to and exit their AWS instance (the cloud)
use common commands such as ls, pwd and cd, on the command line
know the difference between genomics and metagenomics
describe the steps in a metagenomic workflow
perform quality control on reads and assemble them into a metagenome
perform polishing to improve an assembly
use binning to separate the metagenome into different species or MAGs (Metagenome-Assembled Genomes)
use Kraken 2 to assign taxonomy to reads and contigs and phyloseq in R to analyse taxonomic diversity
Year(s) Of Engagement Activity 2023
URL https://cloud-span.github.io/nerc-metagenomics00-overview/
 
Description Online Training Course: Genomics November 2021 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The online training course on Genomics was delivered by the following members of the project team: Emma Rand, Jorge Buenabad-Chavez, Sarah Forrester, Evelyn Greeves, and Annabel Cansdale. The course was delivered over 4 half days to 26 UK-based participants.

Expected learning outcomes - by the end of the training course participants were able to:
• structure their data and metadata and plan for an NGS project
• organise and document genomics data and bioinformatics workflows
• understand what information is needed by a sequencing facility
• gain practice navigating file systems, creating, copying, moving, and removing files and directories
• use command-line tools to assess read quality and perform quality control
• align reads to a reference genome, and identify and visualise sequence variants
• work with Amazon AWS cloud computing and transfer data between a local computer and cloud resources

Feedback from participants was very positive and many stated that they felt their abilities had improved after attending the course, as highlighted in this blog post.

Impact: provides an opportunity for the learner to develop their knowledge and skills.
Year(s) Of Engagement Activity 2021
URL https://cloud-span.github.io/genomics01-intro/
 
Description Online Training Course: Prenomics March 2022 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The online training course on Prenomics was delivered by the following members of the project team: Emma Rand, Jorge Buenabad-Chavez, and Evelyn Greeves. The course was delivered over 2 half days to 28 UK-based participants.

The Prenomics module is designed to prepare people for the Cloud-SPAN Genomics module. We have found that people taking the Genomics module vary in the amount of experience they have had in navigating file systems and using the command line. We have designed the Prenomics module to allow more time for those with less experience to cover some foundation concepts. We have a Self-assessment Quiz to help you decide if you would benefit from attending Prenomics before the Genomics module. The Prenomics and Genomics modules are based on Data Carpentry's Genomics Workshop. Prenomics teaches the basics of command-line programming, including: (1) file directory structure, (2) use of command-line utilities to connect to and use cloud computing and storage resources and (3) basic shell commands for file navigation and basic script writing.

Impact: allows participants to develop their skills and knowledge in this area.
Year(s) Of Engagement Activity 2022
URL https://cloud-span.github.io/prenomics00-intro/
 
Description Participant on Open Life Science Mentorship Programme 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Team member Evelyn Greeves is a participant on the OLS mentorship programme. This allows her to widen her expertise in the area of establishing and maintaining an online community.

Impact: via the OLS network the work of Cloud-SPAN can be publicised.
Year(s) Of Engagement Activity 2022
URL https://openlifesci.org/ols-5/projects-participants/
 
Description Participation in an activity, workshop or similar - Online Training Course: Metagenomics - 31 October 2022 - 25 November 2022 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Participation in an activity, workshop or similar - Online Training Course: Metagenomics - 31 October 2022 - 25 November 2022
Year(s) Of Engagement Activity 2022
URL https://cloud-span.github.io/metagenomics00-overview/
 
Description Participation in an activity, workshop or similar - Training Course: Genomics - 6-7 December 2022 - University of York 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Training course for 14 participants from 6 different institutions. Participants completed the interactive workshop, developed practical skills and increased their knowledge of data management and analysis for genomic research. All participants connected to the Cloud-SPAN community via the Slack channel and in person.
Year(s) Of Engagement Activity 2022
URL https://cloud-span.github.io/00genomics/
 
Description Prenomics online workshop - 22-23 November 2022 - Evelyn Greeves 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Cloud-SPAN is a collaboration between the Department of Biology at the University of York and The Software Sustainability Institute funded by the UKRI innovation scholars award. It aims to train researchers to effectively generate and analyse a range of 'omics data using Cloud computing resources.

This Prenomics module is designed to prepare people for the Cloud-SPAN Genomics module. We have found that people taking the Genomics module vary in the amount of experience they have had in navigating file systems and using the command line. We have designed the Prenomics module to allow more time for those with less experience to cover some foundation concepts. We have a Self-assessment Quiz to help you decide if you would benefit from attending Prenomics before the Genomics module.

Prenomics teaches the basics of command-line programming, including: (1) file directory structure, (2) use of command-line utilities to connect to and use cloud computing and storage resources and (3) basic shell commands for file navigation and basic script writing.

18 participants attended the session.
Year(s) Of Engagement Activity 2022
URL https://cloud-span.github.io/prenomics00-intro/
 
Description Prenomics online workshop - December 2023 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact This Prenomics module is designed to prepare people for the Cloud-SPAN Genomics module.

We have found that people taking the Genomics module vary in the amount of experience they have had in navigating file systems and using the command line. We have designed the Prenomics module to allow more time for those with less experience to cover some foundation concepts. We have a Self-assessment Quiz to help you decide if you would benefit from attending Prenomics before the Genomics module.

Prenomics teaches the basics of command-line programming, including: (1) file directory structure, (2) use of command-line utilities to connect to and use cloud computing and storage resources and (3) basic shell commands for file navigation and basic script writing. 18 participants attended the session.
Year(s) Of Engagement Activity 2023
URL https://cloud-span.github.io/prenomics00-intro/
 
Description Presentation at University of York's Head of Department Meeting 2022 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Project Leads Emma Rand and James Chong delivered an informative presentation giving an overview of the project, including its goals, strategy and training resources. This talk generated new registrations for the Prenomics and Genomics training courses and allowed questions from the general public to be addressed.
Year(s) Of Engagement Activity 2022
URL https://drive.google.com/file/d/1pO-DXIR3p8XncrvGxlf5KLBiRPQYJfxp/view?usp=sharing
 
Description Presentation at University of York's Open Day 2021 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact An open day was hosted at the University of York, at which Emma Rand, the Co-Project Lead, delivered an informative presentation on the goals of the Cloud-SPAN project. This allowed individuals to ask any questions regarding the project and helped to promote registrations for the Cloud-SPAN activities.
Year(s) Of Engagement Activity 2021
 
Description Presentation: Cloud Span: a use case of FAIR implementation on HPC training materials - Evelyn Greeves, Cloud Span, University of York 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Online workshops: FAIR data in the life sciences: beyond theory 19/10/2022 @ 1:00 pm - 3:30 pm.

Presentation: Cloud Span: a use case of FAIR implementation on HPC training materials - Evelyn Greeves, Cloud Span, University of York
Year(s) Of Engagement Activity 2022
URL https://www.ukrn.org/event/fair-data-life-sciences-oct-2022/
 
Description Presenting a Prenomics poster at the Biology Research Away Day 2022 - Wednesday 7 September 2022 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Postgraduate students
Results and Impact More than 100 people attended the Research Away Day, as it was a whole-department event.

Aim of the event: Today brings a well-deserved celebration of the excellent research we do and a key opportunity to engage in the great wealth of research diversity within our department. It is this diversity that inspires new collaboration, supports new ideas and promotes exciting initiatives. Each of us deeply appreciates the creative exploration of Bioscience that touches, motivates and drives progress in our fields. Our department heralds its interdisciplinarity with well-established, world-class centres such as CNAP, YSBL, YCCSA and YESI, to newer inspired institutions including YBRI, PoL and LCAB. Our Biosciences Technology Facility is the envy of the Russell Group with award-winning experts championing innovation, training and promoting our scientific potential. Our Research-led Teaching strengthens and builds our department while our dedication to balance and excel at both Research and Teaching consistently ranks us Top 10 in the UK.
Year(s) Of Engagement Activity 2022
URL https://wiki.york.ac.uk/display/DeptBiol/RAD2022
 
Description Research Coding Club, at the University of York, UK - Presentation: You can make an R package too! 18-04-2024 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Undergraduate students
Results and Impact Presentation to undergrad and postgrad students to teach them to create their own R Package.
Learning objectives included:
Why make a package?
Where do packages come from and where do they live?
Package states
How to make a minimal documented package and check it using the devtools (Wickham et al. 2022) approach
Components of a minimal package
Year(s) Of Engagement Activity 2024
URL https://3mmarand.github.io/make-an-r-pkg/minimal-package.html#/title-slide
 
Description Self-study Module - Create your own AWS module 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Cloud-SPAN is a project run by the Department of Biology at the University of York with the aim of training researchers in the experimental design and analysis of 'omics data using cloud-based High Performance Computing (HPC) resources.

This course teaches how to create and manage your own Cloud-SPAN Amazon Web Services (AWS) instance, which is a Linux virtual machine configured with 'omics data and software analysis tools. The instance you will create is the same instance that is used in the Cloud-SPAN courses Prenomics and Genomics.

If you attend tutor-led editions of Cloud-SPAN's Prenomics and Genomics courses you do not need to create your own instance. We will do that for you! But if you would like to practise afterwards, or study the courses in your own time, you will need to create an instance first.

You will learn (1) how to open and configure your AWS account, which will enable you to use any AWS service; (2) how to create and manage (start, stop and terminate) your instance; and (3) the cost of using your instance.

The course is designed for 2-3 hours of self-study.
Year(s) Of Engagement Activity 2022,2023
URL https://cloud-span.github.io/create-aws-instance-0-overview/
 
Description Statistically useful experimental design - April 2023 - University of York 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Experimental design is critical for 'omics experiments in order to generate data capable of addressing your research questions and control your reagent costs. There are choices to be made about sample preparation and storage, sequencing technologies, the numbers of technical and biological replicates and sequencing depth. The most appropriate choices depend on the type of research question you have, the strengths and weaknesses of the platform and the biological variability.

In this half-day workshop, different case studies are examined in detail to discuss some of the most important aspects that need to be taken into account to design and perform experiments that generate the reproducible, high-quality data you need. Participants also had an opportunity to discuss their own experimental designs and develop their own skills.
Year(s) Of Engagement Activity 2023
URL https://cloud-span.github.io/experimental_design00-overview/
 
Description Training Course: Genomics - January 2024 - Online 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Training course for 17 participants from 12 different institutions. Participants completed the online interactive workshop, developed practical skills and increased their knowledge of data management and analysis for genomics research. All participants connected to the Cloud-SPAN community via the Slack channel.
Year(s) Of Engagement Activity 2024
URL https://cloud-span.github.io/00genomics/
 
Description Training course: Automated Management of AWS Instances 31 Jan 2023 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Training course for 15 members of staff from 2 different departments.

Learning objective

Cloud-SPAN is a project run by the Biology Department at the University of York with the aim of training researchers in the experimental design and analysis of 'omics data using cloud-based High Performance Computing (HPC) resources.

This course teaches how to automatically manage multiple Amazon Web Services (AWS) instances, each instance being a Linux virtual machine. Using Bash shell scripts, it shows how to create, stop, start and delete one or multiple instances with a single invocation of a script.

We use the scripts to manage multiple instances within the Cloud-SPAN project for hands-on training purposes. When running a workshop, a number of instances are created, each configured with the 'omics data and software analysis tools relevant to the workshop. Each student is granted exclusive access to one instance through the use of an encrypted login key.

The scripts take as input only the names of the instances to create, delete, etc. The login keys, IP addresses and domain names used by the instances are created on demand when the instances are created, and deleted likewise when the instances are deleted. Creating over 30 instances takes 10-15 minutes.
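
The name-driven approach described above can be illustrated with a minimal sketch. This is not the Cloud-SPAN Bash code: it is written in Python, it only builds and prints the AWS CLI commands rather than running them, and the key-naming scheme, the machine image ID, the instance type and the `ids` lookup table are all hypothetical placeholders chosen for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical defaults; not taken from the Cloud-SPAN scripts.
AMI_ID = "ami-EXAMPLE"       # placeholder machine image ID
INSTANCE_TYPE = "t3.small"   # assumed instance size


def plan(action: str, names: List[str],
         ids: Optional[Dict[str, str]] = None) -> List[List[str]]:
    """Build the AWS CLI commands for one action over all named instances.

    Nothing is executed: the result is a list of argument lists.
    `ids` maps instance names to EC2 instance IDs and is needed for
    every action except "create".
    """
    ids = ids or {}
    commands: List[List[str]] = []
    for name in names:
        key = f"login-key-{name}"  # assumed naming scheme: one key per instance
        if action == "create":
            # The login key is created on demand together with the instance.
            commands.append(["aws", "ec2", "create-key-pair", "--key-name", key])
            commands.append([
                "aws", "ec2", "run-instances",
                "--image-id", AMI_ID,
                "--instance-type", INSTANCE_TYPE,
                "--key-name", key,
                "--tag-specifications",
                f"ResourceType=instance,Tags=[{{Key=Name,Value={name}}}]",
            ])
        elif action in ("start", "stop"):
            commands.append(["aws", "ec2", f"{action}-instances",
                             "--instance-ids", ids[name]])
        elif action == "delete":
            # The login key is deleted along with the instance.
            commands.append(["aws", "ec2", "terminate-instances",
                             "--instance-ids", ids[name]])
            commands.append(["aws", "ec2", "delete-key-pair", "--key-name", key])
        else:
            raise ValueError(f"unknown action: {action}")
    return commands


if __name__ == "__main__":
    # Dry run: print the commands for two hypothetical instance names.
    for command in plan("create", ["instance01", "instance02"]):
        print(" ".join(command))
```

In this sketch a single call covers any number of instances, which mirrors the single-invocation design described in the course. IP address and domain name handling are omitted for brevity.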

The target audience of the course is anyone in charge of, or interested in, deploying and managing cloud resources. While the course is focused on AWS, and particularly Elastic Compute Cloud (EC2) instances, the scripts can be adapted for use with other cloud providers and other types of cloud services.
Year(s) Of Engagement Activity 2023
URL https://cloud-span.github.io/cloud-admin-guide-0-overview/
 
Description Training course: Statistically useful experimental design - 22 September 2022 - University of York 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Experimental design is critical for 'omics experiments in order to generate data capable of addressing your research questions and control your reagent costs. There are choices to be made about sample preparation and storage, sequencing technologies, the numbers of technical and biological replicates and sequencing depth. The most appropriate choices depend on the type of research question you have, the strengths and weaknesses of the platform and the biological variability.

In this half-day workshop we considered case-studies in detail to discuss some of the most important aspects that need to be taken into account to design and perform experiments that generate the reproducible, high-quality data you need. Participants also had an opportunity to discuss their own experimental designs and develop their own skills.
Year(s) Of Engagement Activity 2022
URL https://cloud-span.github.io/experimental_design00-overview/
 
Description UK Conference for Bioinformatics and Computational Biology talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The UK Conference of Bioinformatics and Computational Biology 2021 brings together biologists, bioinformaticians, computer scientists, software engineers and data scientists across the life sciences, to share innovations, applications and best practice in their fields.
We took part in a workshop session for UKRI Innovation Scholars - Data Science Training in Health and Bioscience, for all the projects awarded as part of this UKRI grant call to hear about the training they are developing in data for life scientists. This session was relevant to those working in life science data who wanted to learn more about the future of training, and was especially relevant to people who already run training in data science in the areas of health and bioscience.
This allowed networking with potential participants of Cloud-SPAN training and with those able to publicise and promote our project.
Year(s) Of Engagement Activity 2021
URL https://www.earlham.ac.uk/uk-conference-bioinformatics-and-computational-biology-21#Programme-5
 
Description UKRI DaSH workshop for all grant holders April 2023 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact We have planned a half-day meeting in York for all grant holders of the DaSH-funded projects, with the following objectives:
- Discuss best practice when running training projects funded on short term grants
- Promote Cloud-SPAN activities
- Network with similar projects to understand how activities can be sustained in the long term
Year(s) Of Engagement Activity 2023
 
Description UKRI Innovation Scholars Data Science Training in Health and Biosciences (DaSH) Community calls organised by Emma Rand for all the DaSH funded projects 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact UKRI Innovation Scholars Data Science Training in Health and Biosciences (DaSH) Community calls, organised by Emma Rand for all the DaSH-funded projects, were held on 06-06-2024, 18-06-2024, 05-07-2024, 15-07-2024 and 23-07-2024.

The meetings allowed all PIs on DaSH-funded projects to discuss best practice and collaborate on a future grant application for additional funding to support the projects.
Year(s) Of Engagement Activity 2024
 
Description Upgrade of Project Website 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact We enhanced the design and layout of the Cloud-SPAN website and hosted it on GitHub using Quarto. The new design will improve user accessibility.
It also enables the following:
1. Promote the Cloud-SPAN project to an international online audience
2. Promote and organise registration of various activities
3. Host News stories and blogs
4. Promote information regarding scholarships
5. Provide a platform for individuals to request further information or ask questions

Impact: enables the dissemination of the project details to a wider audience and generates registrations to current activities
Year(s) Of Engagement Activity 2023
URL https://cloud-span.york.ac.uk/