Using Secondary Data to Measure, Monitor and Visualise Spatio-Temporal Uncertainties in Geodemographics

Lead Research Organisation: University of Liverpool
Department Name: Geography and Planning

Abstract

Geodemographic classifications are categorical summary measures of the built, social and economic characteristics of small geographical areas. There are many geodemographic classifications supplied by both the commercial and non-commercial sectors; however, the most widely used non-commercial geodemographic is the Office for National Statistics (ONS) Output Area Classification (OAC), which was constructed entirely from 2001 Census data. This classification has generated great impact with a user community drawn principally from across the public sector (www.areaclassification.org). The success and growing user base of 2001 OAC has led to the ONS supporting the construction of a new classification utilising the 2011 Census data once released.

A common criticism levelled at classifications built entirely from Census data is that they are insufficiently contemporary: collated data are not available for some years after each Census, and over time these data may become increasingly unrepresentative of the current population and built structures within small areas. The 2001 OAC methodology contained no mechanism through which these errors could be assessed over time, or through which the classification could be reviewed and possibly updated utilising more recent data.

As such, within the historical context of the 2001 OAC, this project will integrate secondary data to implement a methodology for screening small area residential structures over time, and will create temporal measures of uncertainty that can both be communicated to policy-practitioner audiences and be utilised to provide updates to the 2001 OAC. This work will establish a methodology which can then be used prospectively in the construction of the 2011 OAC.

Planned Impact

Who will benefit from this research?

This research project will directly benefit users of the current 2001 OAC classification and, through future implementation of the methodological framework that it will develop, any prospective users of the 2011 classification once developed. As such, specific beneficiaries of this project will principally be drawn from the public sector, where the main user base of OAC resides. This includes numerous local authorities, health sector analysts, education policy makers and the police and justice sector. Use within the private sector is likely more limited; however, this is likely to grow with OAC 2011 given the more user-friendly outputs and associated profiling tools that are planned.
A by no means comprehensive list of end users for this research is suggested by the case study section on the OACUG website: http://areaclassification.org.uk/case-studies/. The OACUG mailing list comprises around 200 registered addresses and the events held annually usually attract around 100 people. Subscribers to the OACoding software now total 288 users. This is not the full extent of OAC users, which is much higher given that the classification can be downloaded without registration from a number of sources.

How will they benefit from this research?

Geodemographic classifications are used principally by the public sector to improve service delivery through generalised mapping of the characteristics of small areas. Typically the use of these representations concerns policy targeting and intervention strategies that are designed to improve the living conditions of local populations. For example, these types of classification may be used to target health screening to "at risk" populations, or to select schools which might be most appropriately visited by a university as part of widening participation initiatives.
Given that current Census based geodemographics do not capture the dynamics of change in small areas over time, these representations may eventually become less effective for describing the characteristics of areas and, as such, erroneous decisions could be made. For applications in the public sector this has significant implications, as the outcome of targeting could change the life chances of populations. For example, local population change might result in the aggregate characteristics of an area moving from low to high risk for some negative health outcome. Targeting without a classification that incorporates these temporal changes may result in the populations of these areas not being offered an appropriate health intervention strategy. As such, the methodology that will be implemented by this research is critically important for public sector uses of geodemographics that influence life chances, as it provides the necessary temporal review and potential update of a classification, and additionally a spatially located measure of uncertainty over time.

Publications

 
Description In methodological terms, this work highlighted the challenges of working with highly granular Output Area (OA) level secondary data where collections are fragmented between constituent UK countries. This hampers the creation of temporally updatable geodemographic products because the compatibility and availability of attributes differ between countries. Research findings have highlighted how LSOA / Data Zone geography may be a more appropriate level at which future UK secondary data based classifications can be constructed. Furthermore, the re-use of secondary data to supplement census attributes has wide ranging (and sometimes undesirable) consequences for temporally updatable OA level classifications. Such issues arose in part because non-census data show a greater degree of inter-correlation than the census attributes for which they serve as proxies. This is manifest in significant differences when compared to classifications built solely with census data, or built using a mixed set of attributes. Given such challenges, secondary data were shown to be most effectively used to create uncertainty measures rather than in hybrid combinations of census and non-census data.
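
As a rough illustration of the inter-correlation issue described above, the sketch below compares the average absolute pairwise Pearson correlation within two sets of small area attributes. This is not the project's own method: the summary statistic, the function names and the toy values are all assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_abs_intercorrelation(attributes):
    """Average |r| over every pair of attribute columns.

    `attributes` maps an attribute name to its values, one per small area.
    """
    pairs = list(combinations(attributes.values(), 2))
    return sum(abs(pearson(x, y)) for x, y in pairs) / len(pairs)


# Hypothetical toy values for five areas; attribute names are invented
# and these are not project data.
census_attrs = {
    "pct_renting": [10, 40, 25, 60, 35],
    "pct_aged_65_plus": [30, 12, 18, 25, 9],
    "pct_no_car": [20, 22, 50, 31, 15],
}
secondary_attrs = {
    "benefit_claim_rate": [5, 20, 12, 31, 17],
    "low_council_tax_band_share": [11, 39, 27, 58, 36],
    "school_absence_rate": [3, 9, 6, 14, 8],
}

census_score = mean_abs_intercorrelation(census_attrs)
secondary_score = mean_abs_intercorrelation(secondary_attrs)
```

A markedly higher score for the secondary set would suggest that its attributes carry overlapping information, which is one reason a clustering built on them could diverge from a census-only classification.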

A geodemographic model for the US was developed from small area estimates derived from the American Community Survey (ACS), a resource that to a great extent has replaced much of the detail that would traditionally be measured by a Census of the Population. This international case study was useful in the context of the 'Beyond 2011' debate about the future of the UK Census, in demonstrating the large error margins of ACS data when associated with neighbourhood scale geodemographic classification. We made the case in this work that a solution to this problem is a shift from a variable-based mode of inquiry to one of multidimensional contexts (e.g. geodemographics). While single ACS variables can be methodologically problematic to use for disaggregate geography, a large collection of such variables provides utility as a composite measure of places under investigation. Given the uncertainty over whether the census will continue into the future, and the potential of alternative non-census sources, we argue that this particular work could hold great significance as our spatial data economies continue to evolve.
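
To make the point about ACS error margins concrete, the following minimal sketch converts a published margin of error into a coefficient of variation for single variables. It relies on ACS margins of error being reported at the 90% confidence level (hence the 1.645 divisor); the variable names and figures are hypothetical and are not taken from this project or from real ACS tables.

```python
# ACS margins of error are published at the 90% confidence level,
# so dividing by 1.645 recovers the standard error.
Z_90 = 1.645


def standard_error(moe):
    """Standard error implied by a 90% margin of error."""
    return moe / Z_90


def coefficient_of_variation(estimate, moe):
    """Standard error as a fraction of the estimate (lower is more reliable)."""
    if estimate == 0:
        raise ValueError("CV is undefined for a zero estimate")
    return standard_error(moe) / estimate


# Hypothetical tract-level figures as (estimate, margin of error);
# these are not real ACS values.
tract_estimates = {
    "total_population": (4200, 350),
    "renter_households": (310, 140),
    "commute_by_bicycle": (25, 30),
}

cv_by_variable = {
    name: coefficient_of_variation(estimate, moe)
    for name, (estimate, moe) in tract_estimates.items()
}
```

In this toy example the rarer the characteristic, the larger its relative error, which is the kind of pattern that makes single small area variables difficult to use on their own.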

Finally, this grant has developed a technique for the compilation of large scale cartographic outputs that can enable the visualisation of multiple attributes about places without the need to consult web maps. Such techniques have hitherto been difficult to develop with "off the shelf" GIS tools; however, through coupling of open source tools a range of cartographic products has been created for local authority districts (e.g. 2011 Census Atlas: http://www.alex-singleton.com/r/2014/02/05/2011-census-open-atlas-project-version-two/).
Exploitation Route We anticipate that the classifications arising out of this research will be extended and implemented by a range of end users. The code used to create our classifications is in the public domain, and we anticipate that it will be adapted to create customised classifications for particular application domains or geographic areas. Our related work in creating the 2011 Output Area Classification is enjoying wide use and has spawned a niche application to London (see cdrc.ac.uk/public). In an age of public expenditure restraint, the software and applications provide a useful free alternative to commercial solutions.

At the time of writing, we are in early negotiations with the Greater London Authority to extend our research methodology in order to develop a geodemographic classification principally from secondary sources. It is hoped that this will enable updating of the 2011 London Output Area Classification, which is widely used to broker planning applications and in a range of public resource allocation decisions.
Sectors Construction,Digital/Communication/Information Technologies (including Software),Education,Energy,Environment,Healthcare,Leisure Activities, including Sports, Recreation and Tourism,Government, Democracy and Justice,Retail

 
Description The grant has generated a series of classifications including:
- temporal Output Area Classification for England
- London Output Area Classification
- New American Atlas (tract level classification for the US)
- analytical refinements to the 2011 Output Area Classification (co-funded by the Office for National Statistics in a related stream of work)

Usage has been stimulated through a variety of outreach activities including a training workshop (13 participants: 42% academic staff, 42% academic students, 8% public sector, 8% commercial); a public engagement website (http://www.opengeodemographics.com/ - 828 recorded users to date); and a visualisation portal (http://public.cdrc.ac.uk - 5500 users to date). Usage will grow through the Output Area Classification User Group, run in association with this grant (https://plus.google.com/u/0/communities/111157299976084744069).

The immediate direct impact of this work has been the adoption of the source code developed for this project (and placed on GitHub - https://github.com/alexsingleton/) in two applications involving mapping of census data: the Northern Ireland Atlas by James Reid (Edina) (http://ukbdev.edina.ac.uk/Census2011/) and James Trimble's application (http://ukdataexplorer.com/census/england). Further linkage has occurred with the Consumer Data Research Centre, and offers the potential for data sharing with other ESRC Phase 2 Big Data centres.
First Year Of Impact 2014
Sector Communities and Social Services/Policy,Construction,Digital/Communication/Information Technologies (including Software),Education,Energy,Environment,Government, Democracy and Justice,Retail
Impact Types Societal,Economic

 
Description Census Atlas 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact The 2011 Census Atlas comprised an atlas for each local authority in England and Wales, with OA level mapping of all 2011 Census variables. This has been very popular: the blog posts associated with the project have been viewed around 2000 times.

The sharing of the code has enabled a number of spin-off projects: James Reid - Northern Ireland Atlas (http://ukbdev.edina.ac.uk/Census2011/) and James Trimble (http://ukdataexplorer.com/census/england).
Year(s) Of Engagement Activity 2014
URL http://www.alex-singleton.com/r/2014/02/05/2011-census-open-atlas-project-version-two/
 
Description Course: Using R for Geodemographic Analysis 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Training was provided in geodemographics, and how these can be used / interpreted alongside caveats. Materials are online: http://rpubs.com/nickbearman/r-geodemographics and https://github.com/nickbearman/r-geodemographic-analysis-20140710

Year(s) Of Engagement Activity 2014
URL https://speakerdeck.com/nickbearman/using-r-for-geodemographic-analysis-thursday-10th-july-10-45am-4...
 
Description Invited talk: Advances in the Geodemographic Study of Population and Place 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact The talk introduced a group of demographers at Oxford to geodemographic profiling.

Year(s) Of Engagement Activity 2014
URL https://speakerdeck.com/alexsingleton/advances-in-the-geodemographic-study-of-population-and-place
 
Description Invited talk: Cities and Context: The Codification of Small Areas through Geodemographic Classification 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact The paper is being taken forward into a collection on code/city.

Year(s) Of Engagement Activity 2014
URL https://speakerdeck.com/alexsingleton/cities-and-context-the-codification-of-small-areas-through-geo...
 
Description Invited talk: Transformative Research in Geographic Information Science 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact As part of a panel session, this talk stimulated debate around the merits of open geographic information science.

Year(s) Of Engagement Activity 2014
URL https://speakerdeck.com/alexsingleton/transformative-research-in-geographic-information-science
 
Description Invited talk: What is so Big about Big Data? Some Observations on Open Data and Systems 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact There was very lively debate after the talk around definitions of big data.

After the talk, we had further discussion about the appropriate use of the term "Atlas" versus "Map Book".
Year(s) Of Engagement Activity 2014
URL https://speakerdeck.com/alexsingleton/what-is-so-big-about-big-data-some-observations-on-open-data-a...