Machine Learning and Dimension Reduction Methods for High-Dimensional Datasets

Lead Research Organisation: Cardiff University
Department Name: School of Mathematics

Abstract

In today's environment, where computer processors are powerful and computer memory is cheap, researchers are able to collect and store huge amounts of data. Analysing that data requires sophisticated statistical and computational methods, as most classical statistical methodology was developed in an era when data collection was not as easy and datasets were many orders of magnitude smaller. Sufficient dimension reduction (SDR) is a class of methods for feature extraction in regression and classification problems, with the purpose of reducing a high-dimensional dataset to a few important features. This has the potential to improve visualization of the most important relationships between the variables.
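As a rough illustration of what an SDR estimator does, the sketch below implements sliced inverse regression (SIR), a classical SDR method. The abstract does not name SIR, so it is chosen here only as a simple example. The function name `sir_directions`, its parameters, and the simulated data are all assumptions made for this sketch, not part of the project.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_directions=1):
    """Estimate SDR directions with sliced inverse regression (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Standardise the predictors: Z = (X - mean) Sigma^{-1/2}.
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ inv_sqrt

    # Slice the observations by the ordered response.
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Weighted covariance of the within-slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        slice_mean = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(slice_mean, slice_mean)

    # Leading eigenvectors of M, mapped back to the original predictor scale.
    _, vecs = np.linalg.eigh(M)
    top = vecs[:, ::-1][:, :n_directions]
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)


# Simulated example (assumed data): y depends on six predictors
# only through the single feature X @ beta.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
y = (X @ beta) ** 3 + 0.1 * rng.normal(size=500)

B = sir_directions(X, y, n_slices=10, n_directions=1)
# Absolute cosine between the estimated and the true direction
# (a value close to 1 means the direction was recovered).
print(abs(float(B[:, 0] @ beta)))
```

Here the six predictors are reduced to one estimated feature, `X @ B`, which can then be plotted against `y` to visualise the relationship.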

This project focuses on improving existing methodology to obtain more accurate and computationally faster estimation algorithms for SDR. Among the most interesting suggestions in the literature is one that uses machine learning algorithms, and more specifically Support Vector Machines (SVM). The method, although powerful, can be improved in several directions, so there are a number of paths a student can take on this project. A few examples are: deriving new SDR methodology that is robust to outliers; deriving sparse SDR methodology; deriving SDR methodology for data with missing predictors; and deriving SDR methodology for functional data, among many others.

Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/N509449/1      |              |              | 01/10/2016 | 30/09/2021 |
1799692           | Studentship  | EP/N509449/1 | 01/10/2016 | 30/06/2020 | Hayley Randall