A UK-Africa Data Science Network: Capturing the SKA-Driven Data Transformation
Lead Research Organisation:
University of Manchester
Department Name: Physics and Astronomy
Abstract
The programme will build a multi-institute big data research and training platform between the South African and UK partner universities. It will establish sustainable links via a Data Science Network, led by academics and developed and refined in consultation with the user community. eResearch capacity is an absolute necessity in a world where the appropriate handling of big data in the natural sciences, medicine, the humanities and the social sciences is of paramount importance. The programme will support the development of research capacity in Big Data and Data Science in South Africa through the creation of a joint UK-SA training network, which will form the basis of a long-term sustainable research collaboration with the potential to address issues of global concern.
The programme is designed to create a UK-SA Data Science Network to advance research and training in big data science. As a core tool to support the network and its programmes, we will develop and deploy an online portal gateway. Such cyber-infrastructure can be used both as a direct teaching resource, hosting MOOCs and other online material, and as a research platform for data science and big data, hosting a data portal and collaborative workspace. The programme does not intend to build physical infrastructure, but will utilise capacity at existing facilities developed for data-intensive research in conjunction with the SKA project in Africa.
Planned Impact
The economic impact of training programmes such as this is often found primarily in providing a skilled workforce for an existing economy in order to grow that sector. The big data analytics economy is still emerging, and the cohort of students trained by this programme will be expected to contribute significantly to securing South Africa's future market share in this area. Impacts will be found primarily in three areas:
People: By establishing a joint UK-SA eResearch infrastructure for Big Data & Data Science we will improve science and innovation expertise (i.e. capacity building). We will do this using student and researcher fellowships which include mobility schemes and joint training programs.
Research: By building on newly established scientific infrastructure we will develop an innovative research program that utilises techniques drawn from the fundamental scientific research surrounding the SKA project in order to expand the impact of those techniques into other domains. We will establish an innovative cross-disciplinary Data Science Network platform that can accelerate and enable data science innovation across multiple fields, both academic and non-academic, in SSA.
Translation: We will target the expansion of data analysis, visualisation and management techniques necessary for the SKA into other domains. We will achieve this by partnering SKA data science projects with non-SKA data science projects under three common research themes: data visualisation, data analytics, and data systems & tools. This approach will enable a parallel development of big data and data science techniques across multiple domains and allow us to use progress in the SKA project to develop innovative solutions on development topics outside the remit of SKA.
Publications

Amugongo L
(2019)
PO-0932 Identification of modes of tumour changes in NSCLC during radiotherapy
in Radiotherapy and Oncology

Barrett A
(2020)
Forecasting vegetation condition for drought early warning systems in pastoral communities in Kenya
in Remote Sensing of Environment

Barrett Adam B.
(2019)
Forecasting vegetation condition for drought early warning systems in pastoral communities in Kenya
in arXiv e-prints

Bowles M
(2021)
Attention-gating for improved radio galaxy classification
in Monthly Notices of the Royal Astronomical Society

Hosenie Z
(2019)
Comparing Multiclass, Binary, and Hierarchical Machine Learning Classification schemes for variable stars
in Monthly Notices of the Royal Astronomical Society

Hosenie Zafiirah
(2020)
Imbalance Learning for Variable Star Classification
in arXiv e-prints

Simon Ndiritu
(2020)
Imputing Missing Data for Polarization Measurements

Vafaei Sadr A
(2019)
DeepSource: point source detection using deep learning
in Monthly Notices of the Royal Astronomical Society
Description | IAU OAD |
Organisation | International Astronomical Union |
Country | France |
Sector | Learned Society |
PI Contribution | DARA Big Data OAD fellowships scheme hosted by the IAU OAD in Cape Town to develop hackathon resources |
Collaborator Contribution | Host for fellowships |
Impact | This collaboration covers astronomy, agriculture and health. |
Start Year | 2019 |
Description | Big Data Africa |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Big Data Africa - postgraduate training school in machine learning, data science and analysis. |
Year(s) Of Engagement Activity | 2018 |
URL | https://www.ska.ac.za/students/big-data-africa-summer-school/ |
Description | Big Data Africa 2019 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Big Data Africa summer school: https://www.sarao.ac.za/students/young-professionals-development-programme-2/ |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.darabigdata.com/big-data-africa-2019 |
Description | CODATA VizAfrica Gaborone 2019 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Undergraduate students |
Results and Impact | DARA Big Data supported a week-long training school for students as part of VizAfrica 2019 in Gaborone, Botswana. This included Python tutorials and astronomy coding tutorials. |
Year(s) Of Engagement Activity | 2019 |
URL | https://vizafrica.codata.org |
Description | DARA Big Data Hackathon Windhoek 2019 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Postgraduate students |
Results and Impact | Astronomy hackathon at the Namibia University of Science and Technology |
Year(s) Of Engagement Activity | 2019 |
URL | https://github.com/darabigdata/WindhoekHack |
Description | Fanaroff Lecture 2020 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Undergraduate students |
Results and Impact | DARA Big Data ran the Fanaroff Lecture 2020, a lecture on science communication for policy engagement aimed at early-career scientists. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.eventbrite.co.uk/e/fanaroff-lecture-2020-tickets-91803078479 |
Description | IDW2018 Hackathon |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Two-day hackathon event to accompany International Data Week 2018 in Gaborone, Botswana. |
Year(s) Of Engagement Activity | 2018 |
URL | https://github.com/darabigdata/IDWBotswana |
Description | JEDI Madagascar |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | 25 students attended a training school on machine learning, data science and data analytics in Madagascar. |
Year(s) Of Engagement Activity | 2018 |
URL | https://www.idia.ac.za/workshop/jedi-madagascar |
Description | Science for Development Cape Town 2020 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Policymakers/politicians |
Results and Impact | DARA Big Data sponsored and participated in the science for development workshop at the IAU OAD in Cape Town |
Year(s) Of Engagement Activity | 2020 |
URL | http://science4dev.astro4dev.org |