Integrated Census Microdata (I-CeM)
Lead Research Organisation:
University of Cambridge
Department Name: Geography
Abstract
From 1851 onwards the decennial British census returns contain vast amounts of comparable information on every household and individual in the country, and are the basis of much of our knowledge of changing social and economic structures in the period. Traditionally, however, the analysis of these sources was time-consuming, involving inputting data from the manuscript census returns into computer systems for analysis. For many years this limited the geographical scale and time periods of the research that could be undertaken using individual-level census materials, constrained the sorts of questions that could be asked, and added to the costs of research, while also severely limiting the opportunities offered for teaching.
The 2009-2013 ESRC-funded Integrated Census Microdata (I-CeM) project, led by Schürer, completely transformed this situation by bringing together computerised versions of the censuses of Great Britain for the period 1851 to 1911. The underlying data for that project had been created via a public/private partnership, mainly for genealogical purposes, at a cost of c.£9 million, but were made available to Schürer to generate a new version of the census data for academic use. In order to maximise the quality, comparability, and usefulness of these digitised census returns, the earlier I-CeM project undertook a number of key tasks, including reformatting, checking and cleaning the data; developing standard coding schemes for occupational and other data; coding the data; standardising administrative boundaries for the periods covered; and the creation of a range of derived variables mainly focused on household membership, structure and composition. In addition, parish-level GIS files were created to enable the resulting census data to be mapped.
The data files created as a result of the earlier I-CeM project have subsequently been curated and disseminated by the UKDS via two platforms developed under the provisions of the initial ESRC funding. One is, essentially, an online interactive download tool and the other an online data tabulation tool. The data were further supported by the creation of a dedicated project website providing researchers with a comprehensive 280-page user guide and a range of associated meta-data.
The existing I-CeM data collection has already generated numerous research publications across a wide range of disciplines, including demography, geography, history, economics, sociology, management and health studies, as well as supporting numerous Ph.D., Masters and undergraduate dissertations. In addition, reaching beyond academia, the data can be tabulated online using a version of the Nesstar system, allowing family and local historians, school children and others to generate bespoke tables from the underlying raw data. Importantly, because the I-CeM datasets are complete censuses rather than samples, in addition to enabling multiple detailed small-scale local studies, the release of I-CeM has allowed research on new subjects and on a scale not previously possible, in turn leading to a number of major UKRI-funded projects.
This project will add the recently released 1921 censuses to I-CeM, an additional 42.8 million individual records. As in the earlier I-CeM project, the 1921 transcriptions will be reformatted, checked, cleaned and, importantly, enhanced with a series of standardised codes and derived variables, without which the data are largely unusable for research purposes. The new data, together with an enhanced and updated version of the existing data for 1851 to 1911, will then be transferred to the UKDS for curation and future access. An important element of this project will be to upgrade and improve the existing data dissemination platforms, which are now some 10 years old. Data access will be supported by the creation of a new User Guide and associated metadata, made available via an upgraded I-CeM project website.
| Title | I-CeM Lookup Table - Building Type code |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variable BTCODE. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/364616 |
| Title | I-CeM Lookup Table - Household Structure code |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variable HHD. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/365096 |
| Title | I-CeM Lookup Table - Occupation code |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variable OCCODE. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/365093 |
| Title | I-CeM Lookup Table - Relationship to Household Head code |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file relates specifically to the I-CeM data collection variable RELA. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/365089 |
| Title | I-CeM Lookup Table -- Country and County of Birth |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variables BPCNTI and BPCNTRY. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/365099 |
| Title | I-CeM Lookup Table -- Disability Codes |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file relates specifically to the I-CeM data collection variables DISCODE1 and DISCODE2. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/365092 |
| Title | I-CeM Lookup Table -- Employer Codes, 1921 |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variable EMPLOYERCODE. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This data resource has been created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/369102 |
| Title | I-CeM Lookup Table -- HISCO Occupation Code |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variable HISCO. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/365098 |
| Title | I-CeM Lookup Table -- Hollerith Birthplace codes: England and Wales 1911; Scotland 1921 |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variable HOLLERBP. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/365111 |
| Title | I-CeM Lookup Table -- Hollerith Industry codes 1911 and 1921 |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variable HOLLERIND. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/365113 |
| Title | I-CeM Lookup Table -- Hollerith Occupation codes 1911 and 1921 |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variable HOLLEROCC. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/365097 |
| Title | I-CeM Lookup Table -- Institution Codes, 1921 |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variable INSTCODE. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/369099 |
| Title | I-CeM Lookup Table -- Language(s) Spoken Codes |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variable LANGCODE. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/369101 |
| Title | I-CeM Lookup Table -- Population tables for England and Wales (including Islands in the British Seas) 1851 to 1921 |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variable PARID and associated place of enumeration variables. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/365131 |
| Title | I-CeM Lookup Table -- Population tables for Scotland, 1851 to 1921 |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variable PARID and associated place of enumeration details. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/365133 |
| Title | I-CeM Lookup Table -- Standardised Place of Birth |
| Description | This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of this spreadsheet. This file is specifically related to the I-CeM data collection variables BPID and STANPLACE. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/369103 |
| Title | I-CeM Lookup Tables -- Consistent place of enumeration Code |
| Description | These three spreadsheets are designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at: https://www.campop.geog.cam.ac.uk/research/projects/icem/ Outline information on the I-CeM project is also provided on the README page of each spreadsheet. Collectively, these files relate to the I-CeM data collection variables CONPARID and PARID. The consistent place or parish of enumeration variable (CONPARID) is, by its nature, a highly complex variable, so the details related to it have been split across three distinct spreadsheets: (1) CONPARID consistent geography of enumeration; (2) CONPARID consistent geography of enumeration by year and PARID; (3) CONPARID consistent geography of enumeration - parish listing. The first of these is the basic lookup for the CONPARID variable, providing a label for each code. The second spreadsheet, giving CONPARID codes by year and parish identifier (PARID), allows users to match year-specific PARIDs to CONPARID and vice versa. Finally, those seeking to identify where particular parishes have been placed in each year should turn to the third spreadsheet. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This research resource was created to support users of the I-CeM data collection. |
| URL | https://www.repository.cam.ac.uk/handle/1810/369270 |
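The two-way matching between year-specific PARIDs and CONPARID described in the entry above can be sketched in Python. This is only an illustration under stated assumptions: the column name `YEAR`, the row layout, and every code and label below are invented for the example and are not taken from the actual spreadsheets; only the variable names CONPARID and PARID come from the description.

```python
# Hypothetical rows shaped like the "by year and PARID" spreadsheet.
# The YEAR column name and all codes are invented for illustration only.
ROWS = [
    {"YEAR": 1851, "PARID": 1001, "CONPARID": 500},
    {"YEAR": 1851, "PARID": 1002, "CONPARID": 500},
    {"YEAR": 1911, "PARID": 2050, "CONPARID": 500},
    {"YEAR": 1911, "PARID": 2051, "CONPARID": 501},
]

# Hypothetical labels shaped like the basic CONPARID lookup (code -> label).
LABELS = {500: "Example consistent unit A", 501: "Example consistent unit B"}


def build_indexes(rows):
    """Build lookups in both directions, keyed by census year."""
    parid_to_conparid = {}
    conparid_to_parids = {}
    for row in rows:
        year, parid, conparid = row["YEAR"], row["PARID"], row["CONPARID"]
        # A year-specific PARID maps to one consistent unit.
        parid_to_conparid[(year, parid)] = conparid
        # A consistent unit may contain several PARIDs in a given year.
        conparid_to_parids.setdefault((year, conparid), []).append(parid)
    return parid_to_conparid, conparid_to_parids


parid_to_conparid, conparid_to_parids = build_indexes(ROWS)

# PARID -> CONPARID for a given year, then the label for that code.
code = parid_to_conparid[(1851, 1001)]
print(code, LABELS[code])

# CONPARID -> all PARIDs that fall inside it in a given year.
print(conparid_to_parids[(1851, 500)])
```

Keying both dictionaries on the census year reflects the statement that PARIDs are year-specific while CONPARID is intended to be consistent over time.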
| Title | Integrated Census Microdata (I-CeM) Names and Addresses, 1851-1911: Special Licence Access |
| Description | This Special Licence access dataset contains names and addresses from the Integrated Census Microdata (I-CeM) dataset of the censuses of Great Britain for the period 1851 to 1911. These data are made available under Special Licence (SL) access conditions due to commercial sensitivity. The anonymised main I-CeM database that complements these names and addresses is available under SN 7481. It comprises the Censuses of Great Britain for the period 1851-1911; data are available for England and Wales for 1851-1861 and 1881-1911 (1871 is not currently available for England and Wales) and for Scotland for 1851-1901 (1911 is not currently available for Scotland). The database contains over 180 million individual census records and was digitised and harmonised from the original census enumeration books. It details characteristics for all individuals resident in Great Britain at each of the included Censuses. The original digital data has been coded and standardised; the I-CeM database has consistent geography over time and standardised coding schemes for many census variables. This dataset of names and addresses for individual census records is organised per country (England and Wales; Scotland) and per census year. Within each data file each census record contains first and last name, street address and an individual identification code (RecID) that allows linking with the corresponding anonymised I-CeM record. The data cannot be used for true linking of individual census records across census years for commercial genealogy purposes nor for any other commercial purposes. The SL arrangements are required to ensure that commercial sensitivity is protected. For information on making an application, see the Access section. The data were updated in February 2020, with some files redeposited with longer field length limits. 
Users should note that some name and address fields are truncated due to the limits set by the LDS project that transcribed the original data. No more than 10,000 records out of some 210 million across the study should be affected. Examples include: England and Wales: 1851, truncated at the 24th character (maximum I-CeM field length 95 characters); 1881, truncated at the 16th character (maximum I-CeM field length 50 characters). Scotland: for 1851-71, truncations affect less than 0.01% of all addresses and for 1851 around 1% at most; 1851, truncated at the 70th character; 1861, truncated at the 76th character; 1871, truncated at the 82nd character; 1881, truncated at the 50th character. Further information about I-CeM can be found on the I-CeM Integrated Microdata Project and I-CeM Guide webpages. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This data resource was created to support research in various fields of economic and social research. It has been used to generate multiple publications and theses. See the project website for more details. |
| URL | https://beta.ukdataservice.ac.uk/datacatalogue/doi/?id=7856#2 |
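The RecID-based link between a names-and-addresses file and the corresponding anonymised file, as described in the entry above, can be sketched as follows. This is a minimal Python sketch under assumptions: the field names `FirstName`, `LastName`, `Address`, `Age` and `Occupation`, and all record values, are hypothetical; only RecID is named in the description. Because the files are organised per country and per census year, the sketch assumes both inputs come from the same country and year.

```python
def link_by_recid(anonymised, names):
    """Attach name/address fields to anonymised records that share a RecID.

    Both arguments are lists of dicts assumed to come from the same
    country and census-year files. Anonymised records with no matching
    RecID are kept unchanged.
    """
    names_by_id = {rec["RecID"]: rec for rec in names}
    linked = []
    for rec in anonymised:
        merged = dict(rec)  # copy so the input records are not modified
        match = names_by_id.get(rec["RecID"])
        if match is not None:
            merged.update(match)
        linked.append(merged)
    return linked


# Invented example records; field names other than RecID are hypothetical.
anonymised = [
    {"RecID": 1, "Age": 34, "Occupation": "Weaver"},
    {"RecID": 2, "Age": 7, "Occupation": "Scholar"},
]
names = [
    {"RecID": 1, "FirstName": "Ann", "LastName": "Example", "Address": "1 High Street"},
]

linked = link_by_recid(anonymised, names)
for record in linked:
    print(record)
```

This links records within a single census only; it does not link individuals across census years.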
| Title | Integrated Census Microdata (I-CeM), 1851-1911 |
| Description | The Integrated Census Microdata (I-CeM) project has produced a standardised, integrated dataset of most of the censuses of Great Britain for the period 1851 to 1921: England and Wales for 1851-1861 and 1881-1921, and Scotland for 1851-1901 and 1921. It makes available to academic researchers detailed information at parish level about everyone resident in Great Britain, collected at most of the decennial censuses between 1851 and 1921. Users should note that the 1871 England and Wales census data and 1911 Scottish census data are not available via I-CeM. The original digital data has been coded and standardised. In addition, the original text and numerical strings have always been preserved in separate variables, so that researchers can go back to the original transcription. However, users should note that name and address details for individuals are not currently included in the database; for reasons of commercial sensitivity, these are held under Special Licence access conditions under SN 7856 for data relating to England, Wales and Scotland, 1851-1911 and SN 9281 for data relating to England and Wales, 1921. This study (7481) relates to the available anonymised data for 1851-1911, i.e. all available years except 1921. Data for England and Wales 1921 are available under SN 9280. The data are available via an online system at https://icem.ukdataservice.ac.uk/ Latest edition information: for the second edition (June 2024), the 1851-1911 data have been redeposited with amended and enhanced data values. Further information about I-CeM can be found on the I-CeM Integrated Microdata Project webpages. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This data resource was created to support research in various fields of economic and social research. It has been used to generate multiple publications and theses. See the project website for more details. |
| URL | https://beta.ukdataservice.ac.uk/datacatalogue/doi/?id=7481#3 |
| Title | Integrated Census Microdata (I-CeM), England and Wales, 1921 |
| Description | The Integrated Census Microdata (I-CeM), England and Wales, 1921 study contains the standardised England and Wales data for 1921. The Integrated Census Microdata (I-CeM) project has produced a standardised, integrated dataset of most of the censuses of Great Britain for the period 1851 to 1921: England and Wales for 1851-1861 and 1881-1921, and Scotland for 1851-1901 and 1921. It makes available to academic researchers detailed information at parish level about everyone resident in Great Britain, collected at most of the decennial censuses between 1851 and 1921. The name and address details for individuals are not currently included in the database; for reasons of commercial sensitivity, these are held under Special Licence access conditions under SN 9281 Integrated Census Microdata (I-CeM) Names and Addresses, England and Wales, 1921: Special Licence Access. See the catalogue record for 9281 for instructions on how to apply for those data. These data are available via an online system at https://icem.ukdataservice.ac.uk/ Further information about I-CeM can be found on the I-CeM Integrated Microdata Project webpages. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | This data resource was created to support research in various fields of economic and social research. It has been used to generate multiple publications and theses. See the project website for more details. |
| URL | https://beta.ukdataservice.ac.uk/datacatalogue/doi/?id=9280#1 |