Integrated Census Microdata (I-CeM)

Lead Research Organisation: University of Cambridge
Department Name: Geography

Abstract

From 1851 onwards the decennial British census returns contain vast amounts of comparable information on every household and individual in the country, and are the basis of much of our knowledge of changing social and economic structures in the period. Traditionally, however, the analysis of these sources was time-consuming, involving inputting data from the manuscript census returns into computer systems for analysis. For many years this limited the geographical scale and time periods of the research that could be undertaken using individual-level census materials, constrained the sorts of questions that could be asked, and added to the costs of research, while also severely limiting the opportunities offered for teaching.

The 2009-2013 ESRC-funded Integrated Census Microdata (I-CeM) project, led by Schürer, completely transformed this situation by bringing together computerised versions of the censuses of Great Britain for the period 1851 to 1911. The underlying data for that project had been created via a public/private partnership, mainly for genealogical purposes, at a cost of c.£9 million, but were made available to Schürer to generate a new version of the census data for academic use. In order to maximise the quality, comparability, and usefulness of these digitised census returns, the earlier I-CeM project undertook a number of key tasks, including reformatting, checking and cleaning the data; developing standard coding schemes for occupational and other data; coding the data; standardising administrative boundaries for the periods covered; and the creation of a range of derived variables mainly focused on household membership, structure and composition. In addition, parish-level GIS files were created to enable the resulting census data to be mapped.

The data files created as a result of the earlier I-CeM project have subsequently been curated and disseminated by the UKDS via two platforms developed under the provisions of the initial ESRC funding. One is, essentially, an online interactive download tool and the other an online data tabulation tool. The data were further supported by the creation of a dedicated project website providing researchers with a comprehensive 280-page user guide and a range of associated metadata.

The existing I-CeM data collection has already generated numerous research publications across a wide range of disciplines, including demography, geography, history, economics, sociology, management and health studies - as well as supporting numerous PhD, Master's and undergraduate dissertations. In addition, reaching beyond academia, the data can be tabulated online using a version of the Nesstar system, allowing family and local historians, school children and others to generate bespoke tables from the underlying raw data. Importantly, because the I-CeM datasets are complete censuses rather than samples, in addition to enabling multiple detailed small-scale local studies, the release of I-CeM has allowed research on new subjects and on a scale not previously possible, in turn leading to a number of major UKRI-funded projects.

This project will add the recently released 1921 censuses to I-CeM - an additional 42.8 million individual records. As in the earlier I-CeM project, the 1921 transcriptions will be reformatted, checked, cleaned and, importantly, enhanced with a series of standardised codes and derived variables, without which the data are largely unusable for research purposes. The new data, together with an enhanced and updated version of the existing data for 1851 to 1911, will then be transferred to the UKDS for curation and future access. An important element of this project will be to upgrade and improve the existing data dissemination platforms, which are now some 10 years old. Data access will be supported by the creation of a new User Guide and associated metadata, made available via an upgraded I-CeM project website.
