Enabling the big data revolution through skills training

Lead Research Organisation: King's College London
Department Name: Genetics and Molecular Medicine

Abstract

Tremendous amounts of data are produced every day in the life sciences and biomedicine and made available to the wider biological research community in the form of databases, data records and digitised information. These data are astoundingly heterogeneous, and the amount of information they can potentially give us is equally vast. We are at a crucial moment in biomedical research, at which the efforts of multiple disciplines in the quantitative and biomedical sciences have to come together to rationalise, quantify and extract the essential information content that can be translated into practical use in the medical context.
We present a flexible training programme to facilitate the understanding of the data landscape that is populating modern medicine. More accurate and informed diagnoses will be possible if all the involved parties are able to extract useful information from patient data records and if these are efficiently integrated with genetic and analytical investigations with the aim of designing personalised therapies. The programme is aimed at a broad set of trainees, including medical practitioners, clinicians, scientists, companies and workers in the health sector. The courses will focus on skills training of different complexity that can be assembled in a personalised modular fashion.
The flexibility lies in the opportunity to pick and mix courses to generate learning curricula of different depths that can be started at any point.
The courses offered will range from data exploration, integration and manipulation to more in-depth analyses via computational statistical and artificial intelligence (AI)-based methods. The trainees will have the opportunity to participate in the assembly of computational pipelines to analyse the data, and to bring their own data to the table for collaborative analyses.

We have designed the training programme around three pillars (workstreams) that we believe are among our strengths in terms of training expertise, data collection and method development: Health Data Science, exploring electronic data records (WS1); 'Omics, harnessing genetics and molecular data collected in online databases (WS2); and Artificial Intelligence, focusing on image data analysis and understanding AI through practical applications (WS3). The cross-talk between these areas of research is only at the beginning; this programme should facilitate collaborative efforts in identifying and overcoming the barriers to effective integration and translation across disciplines.
The programme will engage the supporters, the patients and the public in workshops led by the participants, who will share their learning experience and give feedback and suggestions on the structure and contents of the taught material.

The programme has three levels of governance: A) a management committee of PI, co-Is and WS leads; B) a stakeholder committee of representatives from academic research including ECRs, mid-career and senior leaders, clinical trainees and clinical academics and industry-based trainees; C) an advisory group with representatives invited from the funder and project partners to feed back information in a loop from which the programme will continuously learn and improve.

Technical Summary

Biomedical research is on the verge of a transformative change, provided that all the data gathered in the field can be effectively rationalised and integrated. At present, data are stored in separate silos that are specific to a research field and its culture. It is becoming apparent that cross-disciplinary training encompassing these cultures is the best way forward to connect researchers and to integrate data from those silos.
Non-standardised data formats and lack of generic data pipelines are hampering effective integration and translation across disciplines and therefore limiting scientific cross-communication.
We present a training programme with the ambitious task of bridging the gap between real-world clinical care data, genetics, molecular and imaging data and computational statistical modelling and Artificial Intelligence. We believe we have an excellent research portfolio in these areas and proven success in a number of ongoing teaching activities, but we see the opportunity of integrating these in a truly interdisciplinary programme that is flexible and modular and allows for the design of personalised training pathways that can be initiated at any time.
The programme will have a circular structure with no specific entry point and with the possibility to pick and mix courses from the three areas: Health Data Science, exploring electronic data records (WS1); 'Omics, harnessing genetics and molecular data collected in online databases (WS2); and Artificial Intelligence, focusing on image data analysis and understanding AI through practical applications (WS3).
The courses will cover the understanding of online 'omics data resources and the harnessing of the heterogeneity of health informatics and imaging records by interacting with the data at different levels: from the implementation of simple pipelines to analyse data, to the mastering of statistical analyses, to clinical prediction modelling, machine learning and artificial intelligence data analyses.

 
Description Basic R with Data Carpentry online self-paced module
Geographic Reach Multiple continents/international 
Policy Influence Type Influenced training of practitioners or researchers
Impact People using this training have signed up from health care providers and higher education institutes, with 40% being health care providers. Sign-ups have come from all over the world, with the majority based in the UK. 86% of those who completed the feedback said they will be able to apply their learning to their work and/or research.
URL https://learninghub.kingshealthpartners.org/product?catalog=khp1213c
 
Description Demystifying AI Online learning module.
Geographic Reach Multiple continents/international 
Policy Influence Type Influenced training of practitioners or researchers
Impact People using this training have signed up from health care providers and higher education institutes, with 60% being health care providers. Sign-ups have come from all over the world, with the majority based in the UK. All who completed the feedback said they will be able to apply their learning to their work and/or research. This training module has also been embedded into the HEE AI fellows curriculum and is used for flipped learning, allowing the fellows and the programme to take advantage of the resource already created.
URL https://learninghub.kingshealthpartners.org/product?catalog=khp1215c
 
Description Demystifying AI training - AI Pillar
Geographic Reach Local/Municipal/Regional 
Policy Influence Type Influenced training of practitioners or researchers
Impact Through completing this training, healthcare professionals are introduced to the concept and application of AI and machine learning in healthcare. As this technology is being used more regularly within healthcare, increasing access to training and introducing more healthcare professionals to its applications is key to ensuring the health care sector can keep up with these changes. Feedback from one participant was that: "I really enjoyed the opportunity to better understand the use of AI in healthcare and how it may be applied to research"
 
Description Introduction to Python for Health Research self-paced module
Geographic Reach Multiple continents/international 
Policy Influence Type Influenced training of practitioners or researchers
Impact People using this training have signed up from health care providers and higher education institutes, with 60% being health care providers. Sign-ups have come from all over the world, with the majority based in the UK. 66% of those who completed the feedback said they will be able to apply their learning to their work and/or research.
URL https://learninghub.kingshealthpartners.org/product?catalog=khp1223c
 
Description Introduction to R for health research Online training modules.
Geographic Reach Multiple continents/international 
Policy Influence Type Influenced training of practitioners or researchers
Impact People using this training have signed up from health care providers and higher education institutes, with 40% being health care providers. Sign-ups have come from all over the world, with the majority based in the UK. All who completed the feedback said they will be able to apply their learning to their work and/or research.
URL https://learninghub.kingshealthpartners.org/product?catalog=khp1214c
 
Description Introduction to R training - 'Omics Pillar
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact Participants in this training reported that they would use R more in their work - both within health care and within higher education. This is key as the healthcare workforce is being transformed by an increase in data collection and usage. Beginner-level training is key to helping those not currently using data platforms such as R begin to feel more confident and start applying them to their work and research.
URL https://innovationscholars.co.uk/training/pillar-two-omics/
 
Description Statistics with R online learning module
Geographic Reach Multiple continents/international 
Policy Influence Type Influenced training of practitioners or researchers
Impact People using this training have signed up from health care providers and higher education institutes, with 40% being health care providers. Sign-ups have come from all over the world, with the majority based in the UK. All who completed the feedback said they will be able to apply their learning to their work and/or research. This module has been used as part of a flipped curriculum for the King's College London MRC DTP and the Wellcome DTP.
URL https://learninghub.kingshealthpartners.org/product?catalog=khp1220c
 
Description Statistics with R training - 'Omics Pillar
Geographic Reach Local/Municipal/Regional 
Policy Influence Type Influenced training of practitioners or researchers
Impact This training helps ensure that the healthcare sector is equipped to deal with the increasing volume of data being collected and used in healthcare, and that the healthcare workforce is exposed to a variety of data platforms, including R and statistics. It reached individuals within London-based trusts and HEIs as well as participants from Loughborough and Wales. One participant commented the following: "Excellent course, with blended delivery making it much more accessible to those with caring responsibilities. Thank you!"
 
Description UK Health Data Research Alliance Paper "Principles and Best Practices for Trusted Research Environments"
Geographic Reach Europe 
Policy Influence Type Contribution to new or Improved professional practice
Impact Beyond the foreword of the paper, on 3rd March NHS England and NHS Improvement, the Department of Health & Social Care and the Department for Business, Energy & Industrial Strategy announced a £200m investment in NHS data infrastructure to help put the NHS in the driving seat of data-driven research and innovation. A significant proportion of this would be used to deploy and expand NHS 'Trusted Research Environments' (TREs), a type of 'Secure Data Environment' which will enable the NHS to make life-saving linked data more securely and quickly accessible to researchers, while offering the highest levels of privacy.
URL https://www.england.nhs.uk/blog/collaboration-across-the-system-to-increase-the-privacy-protection-a...
 
Description Using Spreadsheets for recording data and metadata online learning module
Geographic Reach Multiple continents/international 
Policy Influence Type Influenced training of practitioners or researchers
Impact The feedback on this training has mentioned that it has been particularly helpful for members of the healthcare workforce who are working on audits or similar yet have not had training in research and/or data collection. Being able to use Excel more efficiently has been beneficial. People using this training have signed up from health care providers and higher education institutes, with 50% being health care providers. Sign-ups have come from all over the world, with the majority based in the UK. All who completed the feedback said they will be able to apply their learning to their work and/or research.
URL https://learninghub.kingshealthpartners.org/product?catalog=khp1225c
 
Description Using spreadsheets for recording data and metadata training - 'Omics Pillar
Geographic Reach Local/Municipal/Regional 
Policy Influence Type Influenced training of practitioners or researchers
Impact Participants in this training programme were mostly based within London, from various NHS trusts, although 75% of the participants were from a non-clinical background. It is key that training in data science is provided to all, including the healthcare workforce, which is being transformed by an increase in data collection and usage. Feedback from one participant stated that: "The course was really good for a excel beginner."
 
Description Elixir-UK 
Organisation ELIXIR
Department ELIXIR UK
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution Through networks made as a result of this training grant, team members Franca Fraternali, Vasa Curcin and Emily Robinson enabled King's College London to become the 21st member of the Elixir UK Network. Through joining this collaboration, Innovation Scholars Training will be shared with the other 20 members. King's College London will be able to take part in key discussions on life sciences data for both academia and industry and have influence over this international community.
Collaborator Contribution Elixir UK is the UK node of the European network of the same name, which unites organisations across Europe working in life science data. It coordinates, integrates and sustains bioinformatics resources across its member states and enables users in academia and industry to access services that are vital for their research. Membership will ensure that KCL staff and students have access to its training, services and research collaborations.
Impact Franca Fraternali sits on the Elixir UK steering committee.
Start Year 2021
 
Description HDRUK and Innovation Scholars 
Organisation Health Data Research UK
Country United Kingdom 
Sector Private 
PI Contribution We have designed a new module, Introduction to Data Science for HealthCare professionals. Our contribution is in curriculum design, writing the training content and developing video animation.
Collaborator Contribution The HDRUK Futures team is supporting the development of this module through writing content, feedback and curriculum development.
Impact We are working with HDRUK Futures to develop further training aimed at the health care workforce. By working together, we will ensure that our training brings teaching from experts and is targeted at those working in healthcare. The training is still in development and will launch before October 2023.
Start Year 2023
 
Description Academic Foundation Year 1 lecture for King's Health Partners 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Academic Foundation Year 1 trainees were introduced to big data in medicine and the potential impact it can have on diagnostics and patient care. The doctors were given the resources and encouraged to undertake the training on the Innovation Scholars programme ahead of their research block next year in Academic Foundation Year 2.
Year(s) Of Engagement Activity 2023
URL https://www.guysandstthomaseducation.com/project/foundation/
 
Description Governance panel for the Innovation Scholars Programme 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Industry/Business
Results and Impact A governance panel made up of academics within King's College London, professional service leads and industry partners (Google Health, NVIDIA) has been formed to meet twice a year. The role of the governance panel is to take a longer-term view of the grant activities and their relative success. It will advise the training team and help readjust targets/goals as required.
Year(s) Of Engagement Activity 2021,2022
 
Description King's Academy Lightning Lunch 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other audiences
Results and Impact This short talk was presented online to university staff interested in unique and novel training and teaching development. There were roughly 25 people in the audience. The focus of the talk was on the challenges in developing online training in data science that still engaged participants and allowed for tutor-less progression and automated marking of assignments.
Year(s) Of Engagement Activity 2023
 
Description NHS R webinar 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact We presented at the monthly NHS-R community webinar to staff in the NHS who have an interest in using R programming. There were roughly 50 participants who watched the webinar live, and the resulting YouTube video of the recording has been viewed over 130 times.
Year(s) Of Engagement Activity 2022
URL https://www.youtube.com/watch?v=HeGpHy2naJY
 
Description NIHR ARC South London talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact During this talk the Innovation Scholars programme was introduced to the NIHR Applied Research Collaboration team. This included academic leaders and patient and public representatives. The aim was to share training opportunities that researchers and NHS staff could sign up for and also to discuss how to include PPIE within the development of the training modules.
Actions were taken forward to meet with other NHS trusts and to hold further discussions with PPIE groups.
Year(s) Of Engagement Activity 2021
 
Description Speaking to the South London and Maudsley NHS Foundation Trust 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact We spoke to the South London and Maudsley NHS Foundation Trust Deming Design authority, which is formed of a range of analysts and data users across SLaM. The aim was to share data training opportunities and to understand the needs of this group of people. It resulted in further discussions with individuals for specific feedback as well as an introduction to NHS-R and the development of a relationship with this key group.
Year(s) Of Engagement Activity 2022
 
Description Stakeholder panel for the Innovation Scholars programme at KCL 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact A stakeholder panel made up of prospective users of the Innovation Scholars Training programme from across higher education and the NHS was formed. It meets twice a year for feedback and discussion on our training modules. The logistics of training that best suit future participants, as well as the content, are discussed, and changes are made based on these discussions.
Year(s) Of Engagement Activity 2021,2022
 
Description UK Conference of Bioinformatics and Computational Biology 2021 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact The UK Conference for Bioinformatics and Computational Biology brings together researchers, software developers and data managers working across the life sciences to share ideas, discoveries, tools and best practice in computational methods. DaSH award holders were invited to speak and share updates on their training programmes. Challenges and successes were shared between grant holders. The event also gave our target audiences excellent exposure to the training available.
Year(s) Of Engagement Activity 2021
URL https://www.earlham.ac.uk/uk-conference-bioinformatics-and-computational-biology-21#Programme-5