Developing interactive notebooks to support algorithm transparency in data-driven government

Lead Research Organisation: University of Leeds
Department Name: Sch of Geography

Abstract

Data science techniques are increasingly being deployed in government to support the targeting of policy, resources and operational activities. For such techniques to be genuinely adopted, close collaboration is needed between the analysts applying them and the domain practitioners who must interpret their outputs when communicating decisions or making policy recommendations. For example, in order to generate data-driven models that are theoretically and ecologically valid, subject-specific expertise is required when selecting relevant input datasets, when training and evaluating different model specifications, and when communicating the outputs of models to decision-makers.

This project will connect with a programme of work currently being developed by data scientists and expert policy makers at Leicestershire County Council in the domains of Adult Social Care, Youth Services and Transport. The plan is to develop, build and evaluate notebook-enabled visualization tools that insert policy-makers into the data science process (see Wattenberg et al. 2019 for a characteristic example): exposing the underlying mechanisms behind models and promoting intuition around model probabilities using modern techniques for uncertainty visualization. The project will lead to data-driven models that are transparent and ecologically valid and will address challenges identified variously in academia and numerous government white papers on the theme of algorithm transparency in data-driven government.

Project outputs include technically-focused academic papers reporting on the design and evaluation of tools for exposing and communicating data-driven outputs, and empirically-focused papers detailing models generated through the use of such tools. In order to build engagement with, and invite critique from, the wider data science community, the project will have an accompanying GitHub organisation page, with separate repositories for discrete projects. These might focus on different domain areas (Adult Social Care, Young People, Crime and Transport) or on different stages of the data analysis process. For example, some tools will be targeted at model-building (supporting model parameterisation and query), others at the communication of model outputs (robustness checks and uncertainty communication). Generalisable aspects of notebook design will be packaged into software libraries. Through this, the ambition is to demonstrate a workflow, with compelling examples, for effecting meaningful data-driven policy development.

Publications

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
ES/T002085/1                                   01/10/2020  30/09/2027
2747872            Studentship   ES/T002085/1  01/10/2022  30/09/2026  Juliana Novaes