Developing a Census Based Generative Geodemographic Classification System

Lead Research Organisation: University of Liverpool
Department Name: Geography and Planning

Abstract

Leveraging the power of contemporary Artificial Intelligence (AI), this project aims to revolutionize the way in which we can build and use geodemographic classifications. It will do so by enabling more accurate representations of socio-spatial structure and by lowering barriers to census-based classification development. It also proposes a user-friendly online tool that will allow anyone to easily create their own tailored, research-ready census-based geodemographic data product.

Geodemographic classifications provide useful and policy-relevant representations of the complex and multidimensional characteristics of populations living within small geographic areas. Classifications have been created using components of census data since the 1970s, with notable examples in 2001, 2011 and 2021 when the ONS co-produced the first open geodemographic classifications for the UK with academic partners. These "Output Area Classifications" (OAC) have garnered wide use and inspired localised models for specific geographic areas such as London (LOAC).

The core methods used to build geodemographic classifications have, however, remained reasonably static since the 1970s, with only modest updates. Furthermore, creating a classification remains a reasonably technical process, limiting the ability of others to produce their own classifications, either for localities or for specific purposes.

This proposal argues that recent developments in AI, specifically deep learning and machine learning, show great potential to radically transform the power and utility of geodemographic classification: firstly, through the creation of more accurate representations of socio-spatial structure; and, secondly, through improved geodemographic information systems that significantly reduce barriers to developing new classifications.

Aims and Objectives
The aim of this project is to update the established methods used to build census-based geodemographic classifications by integrating AI into:

The more automated development of output area level input measures that better account for non-linear geographic relationships between variables.
A tool that enables the automated description of clusters.
A new public-facing online geodemographic classification system that enables custom census-based classifications to be created.

This will be achieved through the following objectives:

Evaluating the use of autoencoders as a new method of data reduction for output area level geodemographic input measures.
Developing an operational machine learning pipeline that takes output area level census inputs through to cluster creation.
Utilising a large language model (LLM), such as those integrated into ChatGPT, to develop an automated geodemographic descriptive tool capable of producing accurate textual descriptions of cluster characteristics.
Producing a public-facing online tool and accompanying training that will guide users to create their own research-ready census-based geodemographic data products.
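
As a rough illustration of the kind of pipeline these objectives describe, the sketch below takes a handful of area-level measures through to clusters and then assembles a text prompt that could be passed to an LLM. It is not the project's method. The six areas, the variable names (`pct_aged_65_plus`, `pct_renting`, `pct_degree`) and all function names are hypothetical. Plain z-score standardisation stands in for the autoencoder-based data reduction, ordinary k-means stands in for whichever clustering approach the project adopts, and no LLM is actually called.

```python
import random
import statistics

def standardise(rows):
    """Z-score each column so variables on different scales are comparable."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against zero variance
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row."""
    rng = random.Random(seed)
    centres = rng.sample(rows, k)  # initial centres are k randomly chosen rows
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign every row to its nearest centre.
        labels = [min(range(k), key=lambda c: sq_dist(r, centres[c])) for r in rows]
        # Move each centre to the mean of its current members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centres[c] = [statistics.fmean(col) for col in zip(*members)]
    return labels

def cluster_profiles(z_rows, labels, names):
    """Mean z-score of each variable within each cluster."""
    profiles = {}
    for c in sorted(set(labels)):
        members = [r for r, lab in zip(z_rows, labels) if lab == c]
        means = [statistics.fmean(col) for col in zip(*members)]
        profiles[c] = dict(zip(names, means))
    return profiles

def description_prompt(cluster, profile, threshold=0.5):
    """Build a prompt that an LLM could turn into a written cluster description."""
    above = [n for n, z in profile.items() if z >= threshold]
    below = [n for n, z in profile.items() if z <= -threshold]
    return (f"Describe cluster {cluster} in plain English. "
            f"Above average: {', '.join(above) or 'none'}. "
            f"Below average: {', '.join(below) or 'none'}.")

# Hypothetical toy data: six areas, three invented percentage measures.
names = ["pct_aged_65_plus", "pct_renting", "pct_degree"]
areas = [
    [30.0, 15.0, 20.0],
    [28.0, 18.0, 22.0],
    [32.0, 12.0, 18.0],
    [8.0, 60.0, 55.0],
    [10.0, 65.0, 50.0],
    [9.0, 58.0, 52.0],
]

z_rows = standardise(areas)
labels = kmeans(z_rows, k=2)
profiles = cluster_profiles(z_rows, labels, names)
for cluster, profile in profiles.items():
    print(description_prompt(cluster, profile))
```

In a fuller version, the standardisation step is where a learned low-dimensional encoding of the inputs would be substituted, and the prompt string is where a call to a language model would be made.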
