EPSRC Centre for Doctoral Training in Modern Statistics and Statistical Machine Learning
Lead Research Organisation:
Imperial College London
Department Name: Dept of Mathematics
Abstract
The CDT will train the next generation of leaders in statistics and statistical machine learning, who will be able to develop widely-applicable novel methodology and theory, as well as create application-specific methods, leading to breakthroughs in real-world problems in government, medicine, industry and science. The research will focus on the development of applicable modern statistical theory and methods as well as on the underpinnings of statistical machine learning. The research will be strongly linked to applications.
There is an urgent national need for graduates from this CDT. Large volumes of complicated data are now routinely collected in all sectors of society, encompassing electronic health records, massive scientific datasets, governmental data, and data collected through the advent of the digital economy. The underpinning techniques for exploiting these data come from statistics and machine learning. Exploiting such data is crucial for future UK prosperity. However, several reports from government and learned societies have identified a lack of individuals able to exploit these data.
In many situations, existing methodology is insufficient. Off-the-shelf approaches may be misleading due to a lack of reproducibility or to sampling biases that they do not correct. Furthermore, understanding the underlying mechanisms is often desired: scientifically valid, interpretable and reproducible results are needed to understand scientific phenomena and to justify decisions, particularly those affecting individuals. Bespoke, model-based statistical methods are needed, and these may need to be blended with statistical machine learning approaches to deal with large data. The individuals who can fulfil these more sophisticated demands are doctoral-level graduates in statistics who are well versed in the foundations of machine learning. Yet the UK graduates only a small number of statistics PhDs per year, and many of these graduates will not have been exposed to machine learning.
The Centre will bring together Imperial and Oxford, two top statistics groups, as equal partners, offering an exceptional training environment and the direct involvement of absolute research leaders in their fields. The supervisor pool will include outstanding researchers in statistical methodology and theory as well as in statistical machine learning.
We will use innovative and student-led teaching, focussing on PhD-level training. Teaching cuts across years and thus creates strong cohort cohesion not just within a year group but also between year groups. We will link theoretical advances to application areas through partner interactions as well as through a placement of students with users of statistics.
The CDT has a large number of high profile partners that helped shape our application priority areas (digital economy, medicine, engineering, public health, science) and that will co-fund and co-supervise PhD students, as well as co-deliver teaching elements.
Planned Impact
The primary CDT impact will be training 75 PhD graduates as the next generation of leaders in statistics and statistical machine learning. These graduates will lead in industry, government, health care, and academic research. They will bridge the gap between academia and industry, resulting in significant knowledge transfer to both established and start-up companies. Because this cohort will also learn to mentor other researchers, the CDT will ultimately address a UK-wide skills gap. The students will also be crucial in keeping the UK at the forefront of methodological research in statistics and machine learning.
After graduating, students will act as multipliers, educating others in advanced methodology throughout their careers. There is a range of further impacts:
- The CDT has a large number of high calibre external partners in government, health care, industry and science. These partnerships will catalyse immediate knowledge transfer, bringing cutting edge methodology to a large number of areas. Knowledge transfer will also be achieved through internships/placements of our students with users of statistics and machine learning.
- Our Women in Mathematics and Statistics summer programme is aimed at students who could go on to apply for a PhD. This programme will inspire the next generation of statisticians and also provide excellent leadership training for the CDT students.
- The students will develop new methodology and theory in the domains of statistics and statistical machine learning. It will be relevant research, addressing the key questions behind real world problems. The research will be published in the best possible statistics journals and machine learning conferences and will be made available online. To maximize reproducibility and replicability, source code and replication files will be made available as open source software or, when relevant to an industrial collaboration, held as a patent or software copyright.
Organisations
- Imperial College London, United Kingdom (Lead Research Organisation)
- University of California, Berkeley (Project Partner)
- UNAIDS (Project Partner)
- Bocconi University (Project Partner)
- Prowler.io (Project Partner)
- Element AI (Project Partner)
- Centres for Diseases Control (CDC) (Project Partner)
- Swiss Federal Inst of Technology (EPFL), Switzerland (Project Partner)
- Columbia University, United States (Project Partner)
- ACEMS (Project Partner)
- Carnegie Mellon University, United States (Project Partner)
- Manufacturing Technology Centre, United Kingdom (Project Partner)
- The Alan Turing Institute (Project Partner)
- Institute of Statistical Mathematics (Project Partner)
- QuantumBlack (Project Partner)
- University of British Columbia (UBC) (Project Partner)
- Albora Technologies (Project Partner)
- AIMS Rwanda (Project Partner)
- Cortexica Vision Systems Ltd (Project Partner)
- Mercedes-Benz Grand Prix Ltd (Project Partner)
- Tencent (Project Partner)
- EURATOM/CCFE, United Kingdom (Project Partner)
- Microsoft Research Ltd, United Kingdom (Project Partner)
- Washington University in St. Louis (Project Partner)
- Los Alamos National Laboratory, United States (Project Partner)
- Ludwig Maximilians University Munich (Project Partner)
- Select Statistical Services, United Kingdom (Project Partner)
- Samsung Electronics Research Institute, United Kingdom (Project Partner)
- Centrica Plc, United Kingdom (Project Partner)
- Bill & Melinda Gates Foundation (Project Partner)
- University of Paris 9 Dauphine (Project Partner)
- BASF (Project Partner)
- Heidelberg Inst. for Theoretical Studies (Project Partner)
- ASOS Plc (Project Partner)
- JP Morgan Chase (Project Partner)
- University College London, United Kingdom (Project Partner)
- The Rosalind Franklin Institute (Project Partner)
- Dunnhumby (Project Partner)
- Office for National Statistics, United Kingdom (Project Partner)
- The Francis Crick Institute (Project Partner)
- Winnow Solutions Limited (Project Partner)
- African Institute for Mathematical Scien (Project Partner)
- DeepMind (Project Partner)
- Cogent Labs (Project Partner)
- Vector Institute (Project Partner)
- Filtered Technologies (Project Partner)
- Babylon Health (Project Partner)
- RIKEN, Japan (Project Partner)
- Queensland University of Technology, Australia (Project Partner)
- Cervest Limited (Project Partner)
- University of Leiden, Netherlands (Project Partner)
- Novartis Pharma AG, Switzerland (Project Partner)
- B P International Ltd, United Kingdom (Project Partner)
- Microsoft Corporation (USA), United States (Project Partner)
- Harvard University, United States (Project Partner)
- Facebook UK (Project Partner)
- Amazon Development Center Germany (Project Partner)
- Schlumberger Cambridge Research Ltd, United Kingdom (Project Partner)
- Qualcomm Incorporated (Project Partner)
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name |
---|---|---|---|---|---|
EP/S023151/1 | | | 31/03/2019 | 29/09/2027 | |
2247853 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | Stefanos Bennett |
2282781 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | Phillip Murray |
2247868 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | Jason Clarkson |
2282778 | Studentship | EP/S023151/1 | 30/09/2019 | 31/03/2024 | Enrico Crovini |
2635637 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | Stefanos Bennett |
2284087 | Studentship | EP/S023151/1 | 30/09/2019 | 31/12/2023 | Melodie Monod |
2641932 | Studentship | EP/S023151/1 | 30/09/2019 | 31/12/2023 | Melodie Monod |
2247701 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | Michael Hutchinson |
2283002 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | James Wei |
2283474 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | Antoine Meyer |
2635640 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | Jason Clarkson |
2247869 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | Anna Menacher |
2605899 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | Adam Howes |
2247906 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | Sahra Ghalebikesabi |
2283505 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | Harrison Zhu |
2284224 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | Chak Hin Bryan Liu |
2260831 | Studentship | EP/S023151/1 | 30/09/2019 | 29/09/2023 | Daniel Moss |
2420792 | Studentship | EP/S023151/1 | 30/09/2020 | 29/09/2024 | Desislava Ivanova |
2420820 | Studentship | EP/S023151/1 | 30/09/2020 | 29/09/2024 | Zoi Tsangalidou |
2635641 | Studentship | EP/S023151/1 | 30/09/2020 | 29/09/2024 | Oscar Clivio |
2635642 | Studentship | EP/S023151/1 | 30/09/2020 | 29/09/2024 | Lucile Ter-Minassian |
2635643 | Studentship | EP/S023151/1 | 30/09/2020 | 29/09/2024 | Zoi Tsangalidou |
2420772 | Studentship | EP/S023151/1 | 30/09/2020 | 29/09/2024 | Andrew Campbell |
2248365 | Studentship | EP/S023151/1 | 30/09/2020 | 29/09/2024 | James Topping |
2420816 | Studentship | EP/S023151/1 | 30/09/2020 | 29/09/2024 | Lucile Ter-Minassian |
2420649 | Studentship | EP/S023151/1 | 30/09/2020 | 29/09/2024 | Oscar Clivio |
2446052 | Studentship | EP/S023151/1 | 02/10/2020 | 29/09/2024 | Andrea Brizzi |
2605897 | Studentship | EP/S023151/1 | 02/10/2020 | 29/09/2024 | Xing Liu |
2442432 | Studentship | EP/S023151/1 | 02/10/2020 | 29/09/2024 | Tresnia Berah |
2446166 | Studentship | EP/S023151/1 | 02/10/2020 | 29/09/2024 | Thomas Matcham |
2605889 | Studentship | EP/S023151/1 | 02/10/2020 | 29/09/2024 | Alexander Larionov |
2605900 | Studentship | EP/S023151/1 | 02/10/2020 | 29/09/2024 | Ben Tu |
2445745 | Studentship | EP/S023151/1 | 02/10/2020 | 29/09/2024 | Benjamin Howson |
2632835 | Studentship | EP/S023151/1 | 02/10/2020 | 29/09/2024 | Michael Komodromos |
2605895 | Studentship | EP/S023151/1 | 02/10/2020 | 29/09/2024 | Jose Pablo Folch Urroz |
2605902 | Studentship | EP/S023151/1 | 02/10/2020 | 29/09/2024 | Michael Komodromos |
2565026 | Studentship | EP/S023151/1 | 30/09/2021 | 29/09/2025 | Nicholas Steyn |
2564817 | Studentship | EP/S023151/1 | 30/09/2021 | 29/09/2025 | Angus Phillips |
2564803 | Studentship | EP/S023151/1 | 30/09/2021 | 29/09/2025 | Max Anderson Loake |
2565020 | Studentship | EP/S023151/1 | 30/09/2021 | 29/09/2025 | Vikrant Shirvaikar |
2564812 | Studentship | EP/S023151/1 | 30/09/2021 | 29/09/2025 | Alex Buna Marginean |
2564794 | Studentship | EP/S023151/1 | 30/09/2021 | 29/09/2025 | Joseph Benton |
2602507 | Studentship | EP/S023151/1 | 01/10/2021 | 29/08/2025 | Efthymios Costa |
2602530 | Studentship | EP/S023151/1 | 01/10/2021 | 29/08/2025 | Shahriar Hasnat Kazi |
2602749 | Studentship | EP/S023151/1 | 01/10/2021 | 29/08/2025 | Yijin Zeng |
2602756 | Studentship | EP/S023151/1 | 01/10/2021 | 29/08/2025 | Quiquan Wang |
2602524 | Studentship | EP/S023151/1 | 01/10/2021 | 29/08/2025 | Emmeran Chen Johnson |
2602754 | Studentship | EP/S023151/1 | 01/10/2021 | 29/08/2025 | Fabio Feser |
2602755 | Studentship | EP/S023151/1 | 01/10/2021 | 29/08/2025 | Yu Chen |
2740638 | Studentship | EP/S023151/1 | 30/09/2022 | 29/09/2026 | Alexander Forster |
2740715 | Studentship | EP/S023151/1 | 30/09/2022 | 29/09/2026 | Samuel Howard |
2740634 | Studentship | EP/S023151/1 | 30/09/2022 | 29/09/2026 | Stefano Cortinovis |
2740724 | Studentship | EP/S023151/1 | 30/09/2022 | 29/09/2026 | George Hutchings |
2740739 | Studentship | EP/S023151/1 | 30/09/2022 | 29/09/2026 | Anya Sims |
2740743 | Studentship | EP/S023151/1 | 30/09/2022 | 29/09/2026 | Jeffrey Tse |
2740734 | Studentship | EP/S023151/1 | 30/09/2022 | 29/09/2026 | Nicolas Petit |
2740612 | Studentship | EP/S023151/1 | 30/09/2022 | 29/09/2026 | Deepak Badarinath |
2740759 | Studentship | EP/S023151/1 | 30/09/2022 | 29/09/2026 | Linying Yang |
2749396 | Studentship | EP/S023151/1 | 02/10/2022 | 29/09/2026 | Pavithra Srinath |
2748829 | Studentship | EP/S023151/1 | 02/10/2022 | 29/09/2026 | Marcos Tapia Costa |
2748527 | Studentship | EP/S023151/1 | 02/10/2022 | 29/09/2026 | Brendan Martin |
2748915 | Studentship | EP/S023151/1 | 02/10/2022 | 29/09/2026 | Guiomar Pescador Barrios |
2748969 | Studentship | EP/S023151/1 | 02/10/2022 | 29/09/2026 | Hetvi Jethwani |
2748724 | Studentship | EP/S023151/1 | 02/10/2022 | 29/09/2026 | Joshua Corneck-Willcox |