EPSRC Centre for Doctoral Training in Data Science
Lead Research Organisation:
University of Edinburgh
Department Name: School of Informatics
Abstract
Overview: We propose a Centre for Doctoral Training in Data Science. Data science is an emerging discipline that combines machine learning, databases, and other research areas in order to generate new knowledge from complex data. Interest in data science is exploding in industry and the public sector, both in the UK and internationally. Students from the Centre will be well prepared to work on tough problems involving large-scale unstructured and semistructured data, which are increasingly arising across a wide variety of application areas.
Skills need: There is a significant industrial need for students who are well trained in data science. Skilled data scientists are in high demand. A report by McKinsey Global Institute cites a shortage of up to 190,000 qualified data scientists in the US; the situation in the UK is likely to be similar. A 2012 report in the Harvard Business Review concludes: "Indeed the shortage of data scientists is becoming a serious constraint in some sectors." A report on the Nature web site cited an astonishing 15,000% increase in job postings for data scientists in a single year, from 2011 to 2012. Many of our industrial partners (see letters of support) have expressed a pressing need to hire in data science.
Training approach: We will train students using a rigorous and innovative four-year programme that is designed not only to train students in performing cutting-edge research but also to foster interdisciplinary interactions between students and to build students' practical expertise by interacting with a wide consortium of partners. The first year of the programme combines taught coursework and a sequence of small research projects. Taught coursework will include courses in machine learning, databases, and other research areas. Years 2-4 of the programme will consist primarily of an intensive PhD-level research project. The programme will provide students with breadth throughout the interdisciplinary scope of data science, depth in a specialist area, training in leadership and communication skills, and appreciation for practical issues in applied data science. All students will receive individual supervision from at least two members of Centre staff. The training programme will be especially characterized by opportunities for combining theory and practice, and for student-led and peer-to-peer learning.
Planned Impact
The proposed Centre has the potential to bring significant economic benefit to the UK. Data science has applications throughout industry, commerce, science, and the public sector. Methods based on data science are coming to underlie digital commerce, energy sustainability, and digital health care. The application areas that benefit from data science are truly diverse, ranging from genome sequencing to social media, from energy analytics to translational medicine. Our broad consortium of partners reflects the huge number and variety of users of data science methods. The Centre will help to address the immense skills need for data science (see summary, above), bringing about economic benefit to the UK. A deep talent pool of data scientists is likely to provide a strong incentive for companies that require these skills to expand their operations in the UK.
The UK government has recognized the need for increased university provision in data science. The Council for Science and Technology, part of the UK government's Department for Business, Innovation and Skills, recently recommended to Prime Minister David Cameron: "Computer science departments should work in partnership with other university departments and with the private sector to develop multidisciplinary courses with a suitable focus on building aptitude for the practical application of data science. Universities should be encouraged to develop new options including Data Science MSc and PhD programmes." (7 June 2013)
As additional economic benefit, the concentration of excellent students will naturally lead to exciting startups and spinouts. The University of Edinburgh is number one in the UK for spin-out and start-up creation, having recorded 250 startups and spinouts since 2000, with 47 such companies arising from the School of Informatics in the past six years. We have a rich existing infrastructure to support students in commercializing their ideas, including business training and events for connecting students with potential business partners and investors.
Additionally, data science has large potential social benefit. Many charities and public sector organizations have large data sets that they wish to understand in order to create social value. A prime example is our project partner the City of Edinburgh Council, which wishes to combine a large number of disparate resources to build a unified view of a citizen that can be used to improve social services. The skilled cadre of data scientists produced by our Centre will have the potential to bring new techniques to bear on these longstanding problems.
Organisations
- University of Edinburgh (Lead Research Organisation)
- Xerox Europe (Project Partner)
- James Hutton Institute (Project Partner)
- Google (United States) (Project Partner)
- Biomathematics and Statistics Scotland (Project Partner)
- IBM (United Kingdom) (Project Partner)
- Digital Curation Centre (Project Partner)
- Centrum Wiskunde & Informatica (Project Partner)
- Quorate Technology Limited (Project Partner)
- Massachusetts Institute of Technology (Project Partner)
- HSBC Holdings (Project Partner)
- Digital Catapult (Project Partner)
- University of Pennsylvania (Project Partner)
- Leonardo (United Kingdom) (Project Partner)
- Carnegie Mellon University (Project Partner)
- IDIAP Research Institute (Project Partner)
- UCB Pharma (United Kingdom) (Project Partner)
- Psymetrix Limited (Project Partner)
- Amor Group (Project Partner)
- Institute of Science and Technology Austria (Project Partner)
- TimeOut (Project Partner)
- Google (United Kingdom) (Project Partner)
- Royal Bank of Scotland (United Kingdom) (Project Partner)
- Helsinki Institute for Information Technology (Project Partner)
- BrightSolid Online Innovation (Project Partner)
- University of Washington (Project Partner)
- Technical University of Berlin (Project Partner)
- City of Edinburgh Council (Project Partner)
- Agilent Technologies (United Kingdom) (Project Partner)
- Microsoft Research (United Kingdom) (Project Partner)
- Apple (United States) (Project Partner)
- Yahoo! Labs (Project Partner)
- Cloudsoft Corporation (Project Partner)
- Skyscanner (Project Partner)
- Pharmatics Ltd (Project Partner)
- Carnego Systems (United Kingdom) (Project Partner)
- Freescale Semiconductor UK Ltd (Project Partner)
- British Broadcasting Corporation (United Kingdom) (Project Partner)
- Amazon (United Kingdom) (Project Partner)
- Open Data Institute (Project Partner)
- The University of Texas at Austin (Project Partner)
- Saarland University (Project Partner)
- Scottish Power (United Kingdom) (Project Partner)
- AlertMe (United Kingdom) (Project Partner)
- Oracle (United States) (Project Partner)
- SICSA (Project Partner)