SA-DISCnet: A collaborative data science training network across southern Africa and southern UK

Lead Research Organisation: University of Portsmouth
Department Name: Institute of Cosmology and Gravitation

Abstract

Data intensive science is a major global growth area, as the volume, complexity and rate of digital data within governments and companies continue to increase rapidly. At the same time, powerful analysis techniques continue to evolve for obtaining radical insights into large datasets, including finding clusters and anomalies, as well as detecting and predicting dominant trends and correlations in such data. This growth in data intensive science comes at a crucial time for global development. Major worldwide challenges, as encapsulated in the United Nations' Sustainable Development Goals (SDGs), require multidisciplinary solutions, many of which include data science. Moreover, the South African National Development Plan (NDP) for 2030 recognises the need for the country to "sharpen its innovative edge and continue contributing to global scientific and technological advancement" and "shift to a more knowledge-intensive economy".

We therefore propose to build a training network in data intensive science between universities in southern UK and partners in southern Africa to help address these SDGs and NDP priorities. The cornerstones of this network will be the 'Data Intensive Science Centre in SEPnet' (DISCnet) and the African Institute for Mathematical Sciences (AIMS) South Africa. Together, we will pilot an innovative course of training and internships for the next generation of data analysts, focusing on solving SDG-related questions in South Africa and acting as a driver of the country's economy in the 21st century.

Our aim with this pilot training programme is to equip and send students to solve data science problems associated with sustainable development goals (SDGs) in SA and beyond. The specific goals of the pilot programme are to: (i) Deliver an initial cohort of at least 10 highly trained African data scientists; (ii) Provide a world-class data science school to African students, leveraging existing DISCnet training material; (iii) Pump-prime a new 8-week hands-on data science training course at AIMS with contributions from DISCnet; (iv) Contribute to the sustainable development goals via 3-month strategic student internships with South African organisations and companies, focusing on economic development and welfare; (v) Understand the details of managing an extended, sustainable training network across southern Africa.

This pilot leverages considerable investment from STFC, our university partners, and the Royal Society (RS). Our long-term ambition is to create a sustainable network of comparable scale to DISCnet, with approximately 25 African STEM students per year receiving our specialist training. These students will become the future data science leaders in Africa.

Planned Impact

There are several groups who will benefit from the impact of this project:

a) Students from South Africa and other African nations will benefit by being trained in data intensive science through our school and workshop, and will have their training put into practice by participating in our internship programme, where they will solve real-world problems associated with sustainable development goals. Such trained students will be highly sought-after for data science positions in Southern Africa; for example, we would hope that some of our students will return to their host companies to continue their careers. Overall, we expect them to become leaders in data science in the future.

b) Companies and organisations in South Africa who participate in our internship programme will gain new insights into their problems and data, with the aim of improving infrastructure, equal opportunities and innovation in South Africa in line with the SDGs.

c) We will strengthen ties between AIMS and DISCnet for data science-related training. Our SA proposers will gain from the large range of transferable training materials from the DISCnet courses.

d) DISCnet will learn from the good practice of AIMS in training students from diverse backgrounds in data science. DISCnet will also benefit from the experience of the existing AIMS internship programme.

e) Everyone involved will benefit from increased collaborations in data intensive science, both for academic studies (e.g. SKA and LSST) and applications to real-world problems (innovation and SDGs).

f) We will extend our programme in future years to include larger student cohorts, more internships, and nodes across Africa (Cameroon, Ghana, Rwanda, Senegal and Tanzania). Through demonstrating the value and excitement of our programme via this pilot, we will seek further funding from GCRF and other UK and SA opportunities.

 
Description The key outcomes of this partnership are as follows:

- A 3 day training school for students from across Africa, providing training in Data Science;

- An 8 week intensive training programme for students across Africa, enabling practical training and experience in solving data science problems, preparing these participants for internships;

- 11 internships with organisations and businesses in South Africa, applying data science to problems associated with UN sustainable development goals.

ODA relevance: There is a great need for trained workers in Data Intensive Science in African countries; these applied scientists can provide (for instance) strategies for city organisation, new insights into health care, analysis techniques for local start-up businesses, and hence will contribute to economic growth. Our programme has trained a cohort of African data scientists (all from countries on the DAC list) in both theoretical and practical skills in data intensive science and machine learning. We expect that the trained participants will use their much-needed skills to become data science leaders in South Africa and/or their home countries, building data science capacity and continuing to contribute to the Sustainable Development Goals. Over the long-term, our participants will drive cultural change, improving the understanding of the benefits of data science across the region in companies and NGOs. The principal ODA beneficiary country is South Africa, in which AIMS (Cape Town) has led efforts for the school, workshop and internship programme. The South African National Development Plan (NDP) has recognised the importance for the region to be more engaged in the knowledge-intensive economy; this is the issue which our programme has addressed. Data science is set to be a key tool throughout Africa for finding multidisciplinary solutions to many societal and economic issues.
Exploitation Route The African Institute for Mathematical Sciences is planning to hold future Data Intensive Science schools, including one this summer. This will lead to new cohorts of students from across Africa being trained in data science, creating capacity and knowledge of the discipline in DAC countries.
Sectors Digital/Communication/Information Technologies (including Software), Education, Transport

 
Description 40 graduate students attended our training school, all from African countries on the DAC list (14 different nationalities). 16 participants attended our 8-week data science intensive programme, from 7 African nationalities on the DAC list. Modules included collaborative projects and training in several areas: building machine learning algorithms to predict real-world house prices and to detect pneumonia in chest radiography; image classification with convolutional neural networks; and deep learning for natural language processing tasks.

11 participants are in our internship programme, solving problems related to UN sustainable development goals with local organisations and companies. 2 of the interns were from the UK (Universities of Portsmouth and Southampton); these participants travelled to South Africa to engage in research for the Cape Town transport department. They were able to use the skills learned in UK DISCnet to construct detailed models of the Cape Town population with a view to improving the transport network.

We are committed to promoting gender equality in the area of Data Intensive Science. Of the attendees at the initial training school, 50% were female. Of the attendees at the 8-week programme, 31% were female. At these events, 4 of the lecturers were male and 3 were female. Zoona, one of our internship companies, was identified as one of 10 companies worldwide best suited to lifting women and girls out of poverty, and was therefore invited to become a part of the prestigious Girl Effect Accelerator programme promoting gender equality.

Data science is set to be a key tool throughout Africa for finding multidisciplinary solutions to many societal and economic issues. In this project we have focused on UN Sustainable Development Goals 3, 4, 8, 9, 11 and 17, including in the internships. Benefits will include optimised logistics for infrastructure projects, and quantitative planning for startup companies.

SDG3: Our interns working with the University of Pretoria have made initial developments for malaria detection via machine learning for a hand scanner.

SDG4: Our training school and DSI programme aim to provide education and skills to build leadership in data science within DAC countries; our internship with Siyavula seeks to build automated training for South African high school students.

SDG8: Our internship with Zoona supports their work on payments for entrepreneurs and small businesses in African DAC countries; our internship with Grailabs provides AI services to SA industries.

SDG9 and 11: Our interns working with the Cape Town transport department (TDA) have analysed city population and travel behaviour in order to improve commuting strategies for low income commuters. Their initial work will lead to an ongoing collaboration between AIMS and TDA.

SDG17: We have built a useful collaboration between DISCnet UK and AIMS, together with local South African businesses and organisations, building capacity in Data Science for students from African DAC countries.
First Year Of Impact 2019
Sector Digital/Communication/Information Technologies (including Software), Education, Transport
Impact Types Societal, Economic, Policy & public services

 
Description Data Intensive Science training
Geographic Reach Africa 
Policy Influence Type Influenced training of practitioners or researchers
Impact We have provided training and experience in Data Intensive Science for graduate students from African countries. We have held a data science school, an 8-week immersive workshop, and placements with local organisations and companies working on SDG-related projects. Our initial school has trained 40 graduate students from African DAC countries in data science. Students are now applying their training in areas including the financial market, water treatment, and pattern recognition. Feedback is very positive, with 100% keen to recommend the school to colleagues, and 100% wanting a follow-up school. Our 8-week intensive workshop has invested in 16 students from African DAC countries to prepare them to be data science leaders. The students have benefited from coaching by experts from AIMS (Cape Town), SEPnet (UK), the ESRC Business and Data Research Centre (U. Essex) and Netflix. Internships have included work with the University of Pretoria, Cape Town Transport department, Zoona, Siyavula, Naspers, Conversion Science, Grailabs and RLabs. DSI participants are now being offered opportunities in data science careers, including: two Square Kilometre Array research fellowships; a permanent position at RLabs; a part-time position at Conversion Science; and an internship with Data Prophet.
 
Description STFC ODA institutional Award
Amount £15,000 (GBP)
Organisation Science and Technology Facilities Council (STFC)
Sector Public
Country United Kingdom
Start 01/2018 
End 12/2018
 
Description Collaboration between DISCnet and AIMS 
Organisation African Institute for Mathematical Sciences
Country South Africa 
Sector Academic/University 
PI Contribution I have acted as Principal Investigator for the SA-DISCnet collaboration. This collaboration was included as the focus of our grant application, linking the UK Universities participating in the STFC-funded DISCnet doctoral training scheme with the African Institute for Mathematical Sciences (AIMS) in Cape Town, South Africa. My contribution has included managing the whole programme of training students from across Africa in Data Science, including the initial training school, an 8 week intensive data science training programme, and internships with companies in Cape Town. I also acted as a lecturer for the initial school, teaching advanced statistics for data science applications.
Collaborator Contribution The African Institute for Mathematical Sciences has acted as local organiser for the programme. They have hosted the initial training school, communicating with participants beforehand, providing the lecture theatre, organising meals, and providing a computer lab for daily practicals. They also provided one of the lecturers (Michelle Lochner) teaching Machine Learning, together with tutors for the practicals. AIMS were local organisers of the 8-week Data Science Intensive (DSI) programme, organising the venue and communicating with participants. Professor Bruce Bassett from AIMS contributed to the training throughout the DSI programme. AIMS have also acted as local organisers of the internship programme, communicating with participants and internship organisations. The University of Essex provided two lecturers for the DSI programme (Haider Raza and Ana Matran-Fernandez), and QMUL provided one lecturer (Alkistis Pourtsidou). The University of Southampton, University of Sussex and Open University have contributed to proposal writing and management of the project.
Impact The key outcomes of this partnership are as follows:

- A 3 day training school for students from across Africa, providing training in Data Science;

- An 8 week intensive training programme for students across Africa, enabling practical training and experience in solving data science problems, preparing these participants for internships;

- Internships with organisations and businesses in South Africa, applying data science to problems associated with UN sustainable development goals.
Start Year 2018
 
Description Collaboration between DISCnet and AIMS 
Organisation Open University
Country United Kingdom 
Sector Academic/University 
Start Year 2018
 
Description Collaboration between DISCnet and AIMS 
Organisation Queen Mary University of London
Country United Kingdom 
Sector Academic/University 
Start Year 2018
 
Description Collaboration between DISCnet and AIMS 
Organisation University of Essex
Country United Kingdom 
Sector Academic/University 
Start Year 2018
 
Description Collaboration between DISCnet and AIMS 
Organisation University of Southampton
Country United Kingdom 
Sector Academic/University 
Start Year 2018
 
Description Collaboration between DISCnet and AIMS 
Organisation University of Sussex
Country United Kingdom 
Sector Academic/University 
Start Year 2018
 
Description Internships with businesses and organisations in South Africa 
Organisation African Institute for Mathematical Sciences
Country South Africa 
Sector Academic/University 
PI Contribution We have provided 11 internships: 9 for participants in our DSI programme, all from African DAC countries, and 2 for students from the UK. The interns have worked with businesses and organisations in Cape Town to carry out work in data science focussed on problems associated with sustainable development goals. SA-DISCnet has selected participants and found internship organisations, has organised travel and subsistence for the participants, and has continued to communicate with the participants to check progress.
Collaborator Contribution The businesses and organisations have engaged with the interns to explain their current data science challenges, and have interacted with the interns regularly to discuss progress. Four representatives from these organisations attended our internship report day; two provided talks during the Data Intensive Science training programme.
Impact Internships have included work with: the University of Pretoria to deliver a machine learning algorithm to detect malaria with good accuracy from a hand scanner; Cape Town Transport department analysing city population and travel behaviour for low income commuters; Zoona, a payments company partnering with small emerging entrepreneurs and small businesses in Zambia, Malawi and Mozambique; Siyavula, an innovative edutech startup that offers SA high school students personalised automated training; Naspers, an internet and media group; Conversion Science, a search engine marketing and analytics agency; Grailabs, providing machine learning and artificial intelligence services to industry in SA; and RLabs, a Cape Town non-profit aiming to bring social change through information and communication technologies. AIMS will be continuing its association with Cape Town transport, with plans for further internships and joint research projects. AIMS have also created a new industry contact with Conversion Science, who has offered a paid internship to an AIMS Masters student.
Start Year 2018
 
Description Internships with businesses and organisations in South Africa 
Organisation Zoona
Country South Africa 
Sector Private 
Start Year 2018
 
Description Businesses visit 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact On February 16th 2019, we held a workshop where all 9 African internship participants presented the results of their placements so far. Four industry representatives were invited to hear about the results, and then all attendees engaged in discussion about the future landscape of Artificial Intelligence and its impact on Africa. All students and business representatives were from African countries on the DAC list.
Year(s) Of Engagement Activity 2019