Dimensionality reduction when causal inference is the goal

Lead Research Organisation: London School of Economics and Political Science
Department Name: Statistics

Abstract

The vast quantities of data generated in the last decade present both a blessing and a curse for econometrics. It is well documented that 90% of all available data was generated in the last two years alone, thus allowing for a far richer spectrum of regressors, instruments, and controls than previously possible. Conversely, the 'large p, small n' paradigm poses numerous challenges for conventional econometrics. At the most fundamental level, when p > n the ordinary least squares estimator ceases to be unique. A more interesting problem is posed by spurious collinearity among regressors when p is large, which in turn increases the likelihood of erroneous model selection. As a potential topic for doctoral studies, I am interested in investigating the properties of econometric models for high-dimensional data estimated under sparsity assumptions. While a large body of literature already exists on forecasting in data-rich environments, my interests center specifically on building models for causal inference. At this early stage, I am interested in extending the modelling procedure developed by Chernozhukov et al. (2017); their approach is particularly interesting to me, as causal parameters can be consistently estimated even under erroneous model selection. As a starting point, I note that variable selection via tilting, as proposed by Cho and Fryzlewicz, seems particularly amenable to the approach. I would also be interested in extending the approach to data that is correlated in time and space, although I note that significant progress has already been made by Chernozhukov et al. (2018).
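
The non-uniqueness of ordinary least squares when p > n can be seen in a very small case. The following is a minimal sketch in Python, assuming a toy design with n = 2 observations and p = 3 regressors; the data and the names `matvec`, `rss`, `beta_a`, `beta_b` and `null_dir` are hypothetical and chosen purely for illustration.

```python
# Toy illustration (hypothetical data): with p = 3 regressors and only
# n = 2 observations, distinct coefficient vectors fit the data equally well.

def matvec(X, b):
    """Multiply a matrix X (given as a list of rows) by a vector b."""
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, b)) for row in X]

def rss(X, y, b):
    """Residual sum of squares of the linear fit with coefficients b."""
    return sum((y_i - f_i) ** 2 for y_i, f_i in zip(y, matvec(X, b)))

X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]          # n = 2 rows, p = 3 columns
y = [1.0, 2.0]

beta_a = [1.0, 2.0, 0.0]       # one coefficient vector with zero residuals
null_dir = [1.0, 1.0, -1.0]    # a direction that X maps to the zero vector
beta_b = [b - d for b, d in zip(beta_a, null_dir)]  # a second, different vector

# Both candidates attain the same (minimal) residual sum of squares.
print(rss(X, y, beta_a), rss(X, y, beta_b))
```

Because the design matrix has more columns than rows, it has a non-trivial null space, and adding any multiple of `null_dir` to a least-squares solution yields another one. Some additional structure, such as a sparsity assumption, is needed to single out one solution.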

Publications


Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
ES/P000622/1                                   01/10/2017  30/09/2027
2097182            Studentship   ES/P000622/1  01/10/2018  29/12/2022  Shakeel Gavioli-Akilagun