Living with Machines

Lead Research Organisation: The Alan Turing Institute
Department Name: Research

Abstract

Living with Machines is both a research project and a bold proposal for a new research paradigm. In this ground-breaking partnership between The Alan Turing Institute, the British Library, and the Universities of Cambridge, East Anglia, Exeter, and London (QMUL), historians, data scientists, geographers, computational linguists, and curators have been brought together to examine the human impact of the industrial revolution.

It is widely recognised that Britain was the birthplace of the world's first industrial revolution, yet there is still much to learn about the human, social, and cultural consequences of this historical moment. Focussing on the long nineteenth century (c.1780-1920), the Living with Machines project aims to harness the combined power of massive digitised archives and computational analytical tools to examine the ways in which technology altered the very fabric of human existence on a hitherto unprecedented scale. The central theme - the mechanisation of work practices - speaks directly to present debates about how society can accommodate the revolutionary consequences of AI and robotics in what has become known as the fourth industrial revolution. To understand the fraught co-existence of human and machine, this project contends that we need research methods that combine technological innovation and human expertise.

The project will utilise the British Library's National Newspaper collection, and event-based records (census, electoral registration, births/marriages/deaths, trade directories) collected by contributing partner Findmypast. By developing intuitive computational interfaces, and adapting collaborative practices developed in the field of software development, we will enable close interaction between computational methods and historical inquiry.

Outreach and Engagement will be central to the project from the outset, and will take two forms: familiar outcomes such as television programmes and regional exhibitions; and working with individuals and communities to create common understandings of their shared histories. Participatory aspects will embody best practices in crowdsourcing and citizen history.

Project benefits:

1. The UK's first large-scale synergy between data science, artificial intelligence research, and the arts and humanities, building capacity and catalysing new research areas.

2. The development of new computational techniques to marshal the UK's rich archival collections (digitised and born-digital), to enable new research questions to be posed of the holdings.

3. Enriched and interlinked data holdings for the British Library, to add additional context and value to content.

4. The development of generalisable tools, code, and infrastructure that can be adapted for and inspire future interdisciplinary research projects.

5. New historical perspectives on the effects of the mechanisation of labour on the lives of ordinary people during the long nineteenth century.

6. The creation of computational models to represent how language and meanings change across time and geography.

7. Research breakthroughs maintaining UK global leadership in Digital Humanities and driving large-scale international partnerships and opportunities.

Planned Impact

Optional.
 
Title Newspaper Infographic Exhibition, British Library 
Description LwM took responsibility for one of the panels in the British Library's (forthcoming) exhibition of nineteenth-century newspaper infographics. In collaboration with Luke McKernan, the Library's Lead Curator of News, and Yann Ryan, Daniel Wilson (History, text) and Mariona Coll Ardanuy (Computational Linguistics, code) conceived an experimental panel to showcase our research using sentiment analysis on historical newspapers. The panel was made by infographic designer Ciaran Hughes, using datasets provided by the project which focused on emotional responses to industrialisation as seen in newspaper headlines. The exhibition will involve six such panels and will use modern infographic presentational techniques on historical data to tell arresting new stories about nineteenth-century Britain. The exhibition will open on the Lower Ground Floor of the BL in Spring 2021 and will hopefully be seen by large numbers of people and be reported in the press itself. 
Type Of Art Artistic/Creative Exhibition 
Year Produced 2020 
Impact We hope the exhibition will be a corrective to the badly researched uses of historical newspapers to make methodologically unsound claims about the past, and instead showcase a more credible way to apply data science to historical materials, while simultaneously grabbing attention and showcasing the work of the project. 
 
Description Andre Piza participated in European Commission's "Study on Opportunities and challenges of Artificial Intelligence Technologies for the Cultural and Creative Sectors"
Geographic Reach Europe 
Policy Influence Type Contribution to a national consultation/review
 
Description British Library Research Report 2018-19
Geographic Reach National 
Policy Influence Type Citation in other policy documents
Impact The British Library Research Report features the Living with Machines project accounting for its impact on the British Library's activities. According to the report, "The project has already helped the Library explore the potential and challenges of data science methods, including copyright, the use of cloud-based services at scale, and meshing digitisation and analytical timeframes." *(1) and it is "advancing our [the BL's] capability to undertake computational analysis using very large and heterogeneous digitised sources, and our understanding of types of infrastructure that will enable us to deploy more data-driven research in the future."*(2) *(1) Mia Ridge, British Library's Digital Curator for Western Heritage Collections (and Co-I on Living with Machines) *(2) Maja Maricevic, British Library's Head of Higher Education and Science (and Co-I on Living with Machines)
URL https://www.bl.uk/news/2020/november/publication-of-2018-19-research-report
 
Description Guest lecture and assigned reading: Crowdsourcing at the British Library for UCL's MSc in Data Science for Cultural Heritage
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
 
Description Guest lecture: Europeana masterclass for Open Digital Cultural Heritage
Geographic Reach Europe 
Policy Influence Type Influenced training of practitioners or researchers
 
Description Guest lecture: INOS project, overview of citizen science and crowdsourcing
Geographic Reach Europe 
Policy Influence Type Influenced training of practitioners or researchers
URL https://inos-project.eu/2021/07/28/workshop-report-citizen-science-why-get-involved/
 
Description Participation as a case study in AHRC's Technician Commitment Action Plan
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
Impact RTP case studies will be used for internal and external purposes. In the first instance, they will be shared with colleagues across the organisation to increase everyone's understanding of the term Research Technical Professional (technician) within the context of the Arts and Humanities. This will enable AHRC colleagues to confidently identify members of the RTP community working within their respective schemes and keep members of this community informed of how we are championing the Technician Commitment. Case Studies will help to ensure that decisions made at a strategic and governance level are well informed by the varied experiences of RTPs. In the future we will be keen to publish case studies on external platforms such as our website and communication channels, for example the AHRC newsletter and blog. Most importantly, Case Studies will inform research, dialogue and future activity with regards to AHRC's Technician Commitment Action Plan.
 
Description Project output used in collaborative workshop with Estonian museums
Geographic Reach Europe 
Policy Influence Type Influenced training of practitioners or researchers
Impact Attendees developed skills at the workshop, enhanced by their access to our Open Access Handbook.
URL https://esm.ee/for-visitors/news/the-war-museum-helps-estonian-museums-to-put-crowdsourcing-into-use
 
Description Ruth Ahnert fed into the Forecasting Forum on the Future of Research, at the think tank Demos, 5 December 2019, London.
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
URL https://demos.co.uk/wp-content/uploads/2019/10/Jisc-OCT-2019-2.pdf
 
Description Congruence Engine -- Towards a National Collection Discovery Project -- Secondment for Daniel Wilson to Science Museum Group
Amount £3,000,000 (GBP)
Organisation Arts & Humanities Research Council (AHRC) 
Sector Public
Country United Kingdom
Start 11/2021 
End 07/2024
 
Description From crowdsourcing to digitally-enabled participation: the state of the art in collaboration, access, and inclusion for cultural heritage institutions
Amount £64,801 (GBP)
Funding ID AH/T013052/1 
Organisation Arts & Humanities Research Council (AHRC) 
Sector Public
Country United Kingdom
Start 02/2020 
End 12/2021
 
Description Machines Reading Maps: Finding and Understanding Text on Maps
Amount £199,529 (GBP)
Funding ID AH/V009400/1 
Organisation Arts & Humanities Research Council (AHRC) 
Sector Public
Country United Kingdom
Start 02/2021 
End 10/2022
 
Title Beavan, D., Jackson, M. Plain text and metadata extraction tool 
Description Tool for parallel processing of XML in METS/ALTO format for extraction of plain text and metadata fields, available in XSLT and Python versions. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact This data wrangling tool facilitated downstream analysis of historical newspapers focussing on toponym resolution and OCR quality. It forms an essential part of the preprocessing pipeline that will be applied to new datasets whose acquisition is in progress. 
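The kind of extraction this tool performs can be illustrated with a minimal Python sketch. It is not the project's code: the sample document, the function names `local_name` and `alto_to_text`, and the one-line-per-TextLine output format are all assumptions made for illustration. It relies only on the ALTO convention that each word is stored in the CONTENT attribute of a String element inside a TextLine, and it ignores hyphenation and metadata fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature ALTO page; real files are far larger.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="Living"/><SP/><String CONTENT="with"/>
    </TextLine>
    <TextLine>
      <String CONTENT="Machines"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Strip the '{namespace}' prefix so the code is not tied to one ALTO version."""
    return tag.rsplit("}", 1)[-1]


def alto_to_text(xml_string):
    """Return one line of plain text per ALTO TextLine element."""
    root = ET.fromstring(xml_string)
    lines = []
    for element in root.iter():
        if local_name(element.tag) == "TextLine":
            # Keep only String children; SP (space) and HYP elements are skipped.
            words = [
                child.attrib["CONTENT"]
                for child in element
                if local_name(child.tag) == "String" and "CONTENT" in child.attrib
            ]
            lines.append(" ".join(words))
    return "\n".join(lines)


plain_text = alto_to_text(SAMPLE_ALTO)
print(plain_text)
```

Because each file is processed independently, a function of this shape could be mapped over many files in parallel, which is the property the tool description emphasises.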
 
Title Beelen, K., Lexicon Expansion Interface 
Description Notebook for exploring word2vec models in order to build a lexicon that can trace certain topics in a collection. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact The Lexicon Expansion Interface allows users to navigate a vector space and expand a list of seed words into a Lexicon. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/lexicon-expansion/language-l...
 
Title Beelen, K., Lexicon Generator, a tool for generating contrastive lexicons using newspaper data 
Description Notebook for building a lexicon by contrasting two corpora using the Fightin' Words algorithm created by Monroe et al, 2008. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact This notebook is an implementation of the Monroe et al algorithm "Fightin' Words". It is a feature extraction algorithm that computes which words are most significantly associated with a specific subcorpus. This notebook helps us to "profile" certain types of language (e.g. contrast conservative to liberal newspapers). 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language-lab-mro/lexi...
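As an illustration of the statistic behind this entry, the following is a minimal sketch of the log-odds ratio with a Dirichlet prior described by Monroe et al. (2008), expressed as a z-score per word. It is not the notebook's code. The function name `fightin_words_z_scores`, the `prior_scale` parameter, and the use of pooled counts from the two corpora as the prior are assumptions made here; the paper's informative prior is typically estimated from a larger background corpus.

```python
import math
from collections import Counter


def fightin_words_z_scores(tokens_a, tokens_b, prior_scale=1.0):
    """Z-scored log-odds ratio per word; positive values lean towards corpus A."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    vocabulary = set(counts_a) | set(counts_b)
    # Assumption: the prior is the pooled count of each word, scaled.
    prior = {w: prior_scale * (counts_a[w] + counts_b[w]) for w in vocabulary}
    prior_total = sum(prior.values())
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())

    z_scores = {}
    for w in vocabulary:
        smoothed_a = counts_a[w] + prior[w]
        smoothed_b = counts_b[w] + prior[w]
        # Log-odds of the word within each corpus after smoothing.
        log_odds_a = math.log(smoothed_a / (n_a + prior_total - smoothed_a))
        log_odds_b = math.log(smoothed_b / (n_b + prior_total - smoothed_b))
        # Approximate variance of the difference in log-odds.
        variance = 1.0 / smoothed_a + 1.0 / smoothed_b
        z_scores[w] = (log_odds_a - log_odds_b) / math.sqrt(variance)
    return z_scores


# Hypothetical toy corpora standing in for two groups of newspapers.
corpus_a = "the mill workers strike strike wages".split()
corpus_b = "the mill owners profit profit trade".split()
z = fightin_words_z_scores(corpus_a, corpus_b)
for word in sorted(z, key=z.get, reverse=True):
    print(f"{word:10s} {z[word]: .3f}")
```

Words used equally in both corpora score zero, so sorting by z-score puts the most distinctive vocabulary of each corpus at opposite ends of the list. The sketch assumes a vocabulary of at least two words.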
 
Title Beelen, K., Newspaper metadata database and search interface: scripts to build an ElasticSearch index and explore the data using Kibana 
Description Scripts to build an ElasticSearch index and explore the data using Kibana 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Newspaper metadata database and search interface. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/sources-lab-mro/elast...
 
Title Beelen, K., Pipeline for processing the Newspaper Press Directories 
Description The series of notebooks includes a pipeline for processing the OCR (derived from the scans of Mitchell's Press Directories). The stages include: annotation, preprocessing, automatic tagging and database ingest. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact This tool will be crucial for parsing and enriching implicitly structured data (such as the press directories, but also other historical sources). 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/sources-lab-mro/ndp_p...
 
Title Code for Targeted Sense Disambiguation 
Description Code for Targeted Sense Disambiguation and reproducing results published in the http://dx.doi.org/10.18653/v1/2021.findings-acl.243 
Type Of Material Improvements to research infrastructure 
Year Produced 2021 
Provided To Others? Yes  
Impact Reproducing results of the paper. Tools for historical sense disambiguation. 
URL https://github.com/Living-with-machines/TargetedSenseDisambiguation
 
Title Coll Ardanuy, M., Hosseini, K., van Strien, D., McDonough, K., Wilson, D., Krause, A., underlying code for the paper 'Resolving Places, Past and Present: Toponym Resolution in Historical British Newspapers Using Multiple Resources' 
Description Underlying code for the paper 'Resolving Places, Past and Present: Toponym Resolution in Historical British Newspapers Using Multiple Resources'. Resolving Places is one of the first outputs of Living with Machines, a collaborative digital history project at The Alan Turing Institute and the British Library. This research is part of our work to build a nineteenth-century gazetteer that combines place names derived from historical sources (GB1900) with online resources (Wikipedia and Geonames). GB1900 is the result of a crowdsourced project that transcribed all text labels on the 2nd edition 6-inch to 1 mile Ordnance Survey maps of Great Britain (ca. 1900) held by the National Library of Scotland (NLS Maps online). The Living with Machines gazetteer follows best practices in combining multiple existing resources, and is novel in accounting for places that have different scales (e.g. streets, buildings, cities, counties). In the future, we will be adding records and enriching current records with information from OS map 1st edition map label data and other sources. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact This work was presented at a workshop on 27-28 November. Several attendees at the workshop showed interest in using the gazetteer produced through this code. Subsequent completed work and work in progress uses it, within and outside our project. 
URL https://github.com/alan-turing-institute/lwm_GIR19_resolving_places/
 
Title Coll-Ardanuy, M., Code that builds a gazetteer from scratch 
Description Code and method to generate a gazetteer from Wikipedia and enriched with Geonames data. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Part of larger workflow to create a geographical knowledge base that combines different 19thC knowledge sources together. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language-lab-mro/gaze...
 
Title Coll-Ardanuy, M., Hosseini, K., Nanni, F., Toponym Matching 
Description This work looks for potential locations for each toponym identified in text. It addresses the high degree of variation in toponyms (due to regional spelling differences, transliteration strategies, and cross-language and diachronic variation) as well as variation due to OCR errors. 
Type Of Material Improvements to research infrastructure 
Year Produced 2021 
Provided To Others? Yes  
Impact We have built a flexible deep learning framework for candidate selection through toponym matching, using various state-of-the-art neural network architectures (DeezyMatch). The paper that accompanies this repository assesses the performance of DeezyMatch in different experimental settings. The DeezyMatch repository has had a notable impact; this accompanying repository is used for reference. 
URL https://github.com/Living-with-machines/LwM_SIGSPATIAL2020_ToponymMatching
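To make the candidate-selection task concrete, here is a minimal baseline sketch in Python. It deliberately does not use DeezyMatch or any neural model: it ranks gazetteer entries by character-level similarity with the standard library's `difflib.SequenceMatcher`, which is a far simpler technique swapped in for illustration. The gazetteer contents, the OCR-damaged query, and the function name `rank_candidates` are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical gazetteer; a real one would hold many thousands of entries.
GAZETTEER = ["Manchester", "Winchester", "Colchester", "Leicester"]


def rank_candidates(query, gazetteer, top_n=3):
    """Return the top_n gazetteer names most similar to the query string."""
    scored = [
        (SequenceMatcher(None, query.lower(), name.lower()).ratio(), name)
        for name in gazetteer
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(name, round(score, 3)) for score, name in scored[:top_n]]


# "Manchefter" imitates a long-s OCR error of the kind found in historical print.
candidates = rank_candidates("Manchefter", GAZETTEER)
for name, score in candidates:
    print(name, score)
```

A surface-similarity baseline like this handles single-character OCR errors but has no notion of transliteration or diachronic spelling change, which is the gap a learned matcher is meant to address.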
 
Title Hobson, T., Tolfo, G. Methodological paper on Living with Machines' metamodel 
Description Data modelling methodology developed to underpin data infrastructure with the aim of promoting interoperability of tools and systems and accessibility of data and derived artefacts within the project and externally. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact The common data model developed by this method has been used in the design of relational database schemas and other research infrastructure to support interoperability across different source data types and varied research activities. 
URL https://www.overleaf.com/read/qjqqfdrqxkpr
 
Title Hosseini, K. and Vane, O. PressPicker code 
Description The PressPicker tool can be used to filter and visualise British Library holdings of undigitised newspapers as a function of time. It is also an interactive tool to pick newspaper titles (e.g. for digitisation). It consists of two Python Jupyter notebooks and a custom JavaScript interactive visualisation. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Successfully made two selections of newspaper titles for digitising within Living with Machines. 
 
Title Hosseini, K., Beelen, K., basic lexicon expansion algorithms using word embeddings 
Description In this notebook, we use the trained word embeddings (using word2vec or fasttext models) to explore the semantic space of our book and sample newspaper datasets. Several basic methods are implemented, e.g. explore the neighbouring words given a seed word (e.g., what are the most similar words to "machine" given our corpus?); visualisation of word vectors using t-SNE. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact This work is in progress. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/blob/master/language-lab-mro/lexi...
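The nearest-neighbour exploration described in this entry can be sketched in a few lines of Python. This is an illustration only, not the notebook's code: the three-dimensional vectors are invented toy values rather than output from a trained word2vec or fastText model, and the names `TOY_VECTORS`, `cosine`, and `expand_lexicon`, along with the similarity threshold, are assumptions.

```python
import math

# Invented toy vectors; trained embeddings typically have hundreds of dimensions.
TOY_VECTORS = {
    "machine": [0.9, 0.1, 0.0],
    "engine": [0.8, 0.2, 0.1],
    "loom": [0.7, 0.3, 0.0],
    "harvest": [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_lexicon(seeds, vectors, threshold=0.9):
    """Add every word whose similarity to any seed word meets the threshold."""
    lexicon = set(seeds)
    for word, vector in vectors.items():
        if word in lexicon:
            continue
        if any(cosine(vector, vectors[seed]) >= threshold for seed in seeds):
            lexicon.add(word)
    return sorted(lexicon)


lexicon = expand_lexicon(["machine"], TOY_VECTORS)
print(lexicon)
```

In practice the threshold is a judgement call, and candidate words would normally be reviewed by a researcher before being accepted into the lexicon.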
 
Title Hosseini, K., Nanni, F., Coll-Ardanuy, M., DeezyMatch: A Flexible Deep Neural Network Approach to Fuzzy String Matching 
Description A free, open-source software library written in Python for fuzzy string matching and candidate ranking. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact String matching is an integral component of many natural language processing (NLP) pipelines. DeezyMatch, a new deep learning approach to fuzzy string matching and candidate ranking, is a free, open-source community software that strives to address advanced string matching and candidate ranking challenges in a more comprehensive and integrated manner than existing tools. DeezyMatch is written in the Python programming language. Thanks to its easy-to-use interfaces, DeezyMatch can be seamlessly integrated into existing entity linking systems. This allows DeezyMatch to be adopted outside the NLP community, especially in Digital Humanities, where it could play a major role in addressing known issues concerning the adoption of entity linking systems due to the non-standard nature of the datasets typically used in this field. DeezyMatch has been the topic of a tutorial and round table (at the LinkedPasts conference 2020) and of an interactive workshop (at the Alan Turing Institute Digital Humanities and Research Software Engineering Summer School, 2021). The GitHub repository has 64 stars and 26 forks. 
URL https://github.com/Living-with-machines/DeezyMatch
 
Title Hosseini, K., exploratory data analysis of GB1900 dataset 
Description A set of Jupyter-notebooks for visualisation and statistical analysis of GB1900 dataset. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These Jupyter-notebooks were developed to explore the GB1900 dataset, including visualisation of various entities (e.g., railway) on a map. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/space-time-mro/gb1900...
 
Title Hosseini, K., exploratory data analysis of newspaper/book databases 
Description A set of Jupyter-notebooks to perform exploratory data analysis on newspaper and book databases. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These notebooks were developed as teaching/research tools to: 1) show how to access a remote Postgres DB, query it, and plot the results; 2) perform exploratory data analysis (e.g., visualisation and simple statistical analysis) on the data. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/relational_database_e...
 
Title Hosseini, K., from raw data to language-models/word-embeddings 
Description These notebooks combined form a pipeline in which raw book/newspaper textual data can be accessed, preprocessed and then used to generate word embeddings and language models. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These notebooks (and their Python-script version) have been extensively used to generate word2vec, fasttext, Flair and BERT language models. These models are being used in several NLP-related projects. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language_models/noteb...
 
Title Hosseini, K., intrinsic evaluation of word embeddings / language models 
Description The performance of any trained machine learning model needs to be evaluated (intrinsically or extrinsically) before being used. Here, we collected several datasets and developed a set of codes to evaluate trained word embeddings and language models. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Evaluation of all word-embeddings/language models being used in the project. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/blob/master/language_models/noteb...
 
Title Hosseini, K., parallel processing of book (and newspaper) dataset using MPI (Message Passing Interface) 
Description As we are dealing with large textual datasets (e.g., our book dataset contains 4.5B words), we started to experiment with different distributed and parallel algorithms to preprocess the data and to train machine learning models. Here, we used MPI (Message Passing Interface) through Python. This code distributes the job among the requested number of CPUs (workers), which can be on different nodes in a supercomputer (i.e. not limited to shared-memory machines); it therefore significantly reduces the wall time. This code was tested on Urika. Urika is no longer available, and we now exclusively use Azure virtual machines (VMs). These VMs are shared-memory, so we switched to simpler parallel-processing algorithms. However, the MPI algorithm and tools developed here should be usable later when we have access to even larger datasets. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Preprocess and extract information (e.g., part-of-speech tagging) from large textual datasets. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language_models/mpi_v...
 
Title Hosseini, K., record linkage using various multi-class classifiers and manual annotations 
Description Record linkage across two noisy datasets (for example, historical texts) is a non-trivial task. In this tool, we experimented with different multi-class classifiers, e.g. decision tree and multilayer perceptron architectures. We also assessed the impact of features (e.g., title, date and place of publication) on the statistical performance of these models. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Creating a list of linked entities between NPD (newspaper press directory) and British Library titles list. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/sources-lab-mro/linki...
 
Title Hosseini, K., upload images to Zooniverse 
Description ~10,000 images from the digitised newspaper articles were selected and uploaded to Zooniverse for annotation. Defoe, a Spark-based toolbox for analysing digital historical textual data, was used to select the images. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact The human/expert annotation is one of the main ingredients in training and evaluating supervised machine learning methods. The results of this experiment can be used in various tasks, e.g., sentence/document classification. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/communities-mro/zooni...
 
Title Living with Machines GitHub Stats report 
Description This repository automatically updates GitHub statistics data for the Living with Machines GitHub Organization and generates a report based on this data. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact 38 unique repository views 
URL https://github.com/Living-with-machines/github_stats_report
 
Title Neural Language Models for Historical Research 
Description We have pre-trained four types of neural language models on a large historical dataset of books in English, published between 1760 and 1900 and comprising ~5.1 billion tokens. The language model architectures include word type embeddings (word2vec and fastText) and contextualized models (BERT and Flair). For each architecture, we trained a model instance using the whole dataset. Additionally, we trained separate instances on text published before 1850 for the type embeddings (i.e., word2vec and fastText), and four instances considering different time slices for BERT. 
Type Of Material Improvements to research infrastructure 
Year Produced 2021 
Provided To Others? Yes  
Impact The repository has had several forks and the language models are already being used by several researchers external to the project. 
URL https://github.com/Living-with-machines/histLM
 
Title Repository for code underlying the paper 'Living Machines: A Study of Atypical Animacy' (COLING2020) 
Description This repository provides underlying code and materials for the paper 'Living Machines: A Study of Atypical Animacy' (COLING2020). 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact This is the code accompanying the paper "Living Machines: A study of atypical animacy" (2020). This paper has already been cited three times in external publications, and the GitHub repository has four external stargazers and one fork. The code in this paper has been used and adapted in a forthcoming publication from the Living with Machines project. 
URL https://github.com/Living-with-machines/AtypicalAnimacy/
 
Title Station to Station: Linking and Enriching Historical British Railway Data 
Description This repository provides underlying code and materials for the paper 'Station to Station: Linking and Enriching Historical British Railway Data', accepted at the Computational Humanities Research conference (2021). It contains the steps to reproduce the experiments reported in the paper and to generate a structured version of Michael Quick's book "Railway Passenger Stations in Great Britain: a Chronology". 
Type Of Material Improvements to research infrastructure 
Year Produced 2021 
Provided To Others? Yes  
Impact This repository contains the code used to generate StopsGB (Structured Timeline of Passenger Stations in Great Britain, https://doi.org/10.23636/wvva-3d67). This dataset is currently being used in other projects within Living with Machines, and we believe it will be of widespread interest across the historical, digital library and semantic web communities, and that it will be a key resource for ongoing research into the impact of the railway in Great Britain. The code used to generate a gazetteer is already being used in the Machines Reading Maps project. 
URL https://github.com/Living-with-machines/station-to-station
 
Title Vane, O. OS maps metadata visualisation code 
Description Custom visualisation of digitised 19th Century Ordnance Survey maps (from National Library of Scotland) to investigate patterns of map revision through time. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Used tool to create supporting material for BL map digitisation proposal and to help identify suitable locations for historical case studies (factors include OS map coverage). 
 
Title Vane, O., Code for filtering Kings Topographical map collection metadata 
Description Python Jupyter notebook for filtering British Library KTop metadata by geography and time. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Identifying relevant digitised material for Living with Machines research. 
 
Title Vane, O., Code underlying a blogpost about how to put a D3 JavaScript visualisation in a Python Jupyter notebook. 
Description Jupyter notebook demonstrating how to use JavaScript and the D3 visualisation library in a Python Jupyter notebook. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact Email from a blog reader describing it as very helpful. 
URL https://github.com/alan-turing-institute/D3_JS_viz_in_a_Python_Jupyter_notebook
 
Title Vane, O., Strabo output visualisation code 
Description Visualising the output of 'Strabo' tool (software tool to auto-transcribe text in historical maps by researchers at the University of Southern California Spatial Informatics Laboratory). 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Non-statistical evaluation of the Strabo tool's success with our map data. 
 
Title Working with maps at scale using Computer Vision and Jupyter notebooks (Notebook/code) 
Description Notebook showing how to use computer vision/Jupyter Notebooks to support working with image collections at scale. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact Materials used at a workshop with ~30 attendees. 
URL https://github.com/Living-with-machines/maps-at-scale-using-computer-vision-and-jupyter-notebooks
 
Title gh_orgstats 
Description gh_orgstats is intended to provide some easy ways of getting stats for a GitHub org. gh_orgstats does this by wrapping some functions around PyGithub. This code is mainly intended to help generate reports as part of a GitHub actions pipeline to update GitHub usage stats for a funder. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact 56 unique GitHub Clones of the repository hosting the code 
URL https://github.com/Living-with-machines/gh_orgstats
 
Title van Strien, D., Beelen, K., Coll Ardanuy, M., Hosseini, K., McGillivray, B., Colavizza, G., underlying code for the paper 'Assessing the Impact of OCR Quality on Downstream NLP Tasks' 
Description These notebooks contain the underlying code for the paper 'Assessing the Impact of OCR Quality on Downstream NLP Tasks'. The code runs experiments reported in the paper and generates the figures used in the paper. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact This code helps the project better understand issues relating to OCR technology and will inform research methods for our projects and other projects working with text produced through OCR. 
URL https://github.com/alan-turing-institute/lwm_ARTIDIGH_2020_OCR_impact_downstream_NLP_tasks
 
Title van Strien, D., Beelen, K., McDonough, K. 4 Jupyter notebooks on basic computer vision methods for historic OS maps 
Description These notebooks provide an explanation on using computer vision methods with historic maps. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These notebooks have been used in two workshops with >40 participants. They will be developed further into a series of tutorials. 
 
Title van Strien, D., Beelen, K., McDonough, K. 5 Jupyter notebooks on using Deep-learning methods for computer vision on historic OS maps 
Description Additional notebooks on using computer vision methods with historic digitised map collections. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These notebooks have been used as teaching materials in two workshops and will be developed further into publicly available tutorials. 
 
Title van Strien, D., Prototype Maps annotation pipeline 
Description A prototype method for collecting annotations from researchers, running classification and analysing historic maps at scale. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? No  
Impact These methods have been used as an initial prototype which is currently being developed further inside the project. 
 
Title 19th Century United States Newspaper Advert Classifications 
Description A dataset of images drawn from the Library of Congress Newspaper Navigator Dataset (news-navigator.labs.loc.gov/). The dataset contains images and annotations used for training computer vision models to classify whether an advert is illustrated or not. This is a supplement to a forthcoming Programming Historian lesson (programminghistorian.org/) but can be used independently of this lesson. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? No  
Impact The dataset will be made public to coincide with the release of the Programming Historian Tutorials. 
 
Title Book and newspaper databases 
Description This database consists of ~49K books (metadata and full text, 4.5B words) and 11.8M newspaper pages (metadata only). We used the "Azure Database for PostgreSQL" service to manage this database. Various scripts and Jupyter notebooks have been developed to access this database and perform exploratory data analysis. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact This database has been used in various text mining and natural language processing tasks, such as: 1) Generating language models including word2vec, fasttext, Flair and BERT type models. The book database was mainly used here as it has a large number of books suitable for training stable language models; however, we also trained several models using a sample from newspaper articles. 2) Pre-trained models used in "Assessing the Impact of OCR Quality on Downstream NLP Tasks" paper. 3) Developing the processing pipeline. 
 
Title British Library Books genre detection model 
Description This model is intended to predict, from the title of a book, whether it is 'fiction' or 'non-fiction'. This model was trained on data created from the Digitised printed books (18th-19th Century) book collection. The datasets in this collection are comprised and derived from 49,455 digitised books (65,227 volumes), mainly from the 19th Century. This dataset is dominated by English language books and includes books in several other languages in much smaller numbers. This model was originally developed for use as part of the Living with Machines project to be able to 'segment' this large dataset of books into different categories based on a 'crude' classification of genre i.e. whether the title was `fiction` or `non-fiction`. 
Type Of Material Computer model/algorithm 
Year Produced 2021 
Provided To Others? Yes  
Impact Used as part of a forthcoming Living with Machines tutorial on genre classification 
URL https://doi.org/10.5281/zenodo.5245175
 
Title Dataset for Toponym Resolution in Nineteenth-Century English Newspapers 
Description We present a new dataset for the task of toponym resolution in digitised historical newspapers in English. It consists of 343 annotated articles from newspapers based in four different locations in England (Manchester, Ashton-under-Lyne, Poole and Dorchester), published between 1780 and 1870. The articles have been manually annotated with mentions of places, which are linked---whenever possible---to their corresponding entry on Wikipedia. The dataset is published on the British Library shared research repository, and is especially of interest to researchers working on improving semantic access to historical newspaper content. We share the 343 annotated files (one file per article) in the WebAnno TSV file format version 3.2, a CoNLL-based file format. We additionally provide a TSV file with metadata at the article level, and the annotation guidelines. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact This dataset has already been used by researchers working on the task of named entity recognition in historical digitised newspapers. It will be used in the HIPE 2022 shared task ("Identifying Historical People, Places and other Entities", https://hipe-eval.github.io/HIPE-2022/) organised by the Impresso project, on "Named Entity Recognition and Linking in Multilingual Historical Documents". Teams from different institutions will use the dataset to develop and assess the performance of state-of-the-art methods in the tasks of named entity recognition and entity linking. This is the second edition of the shared task; 13 teams participated in the first edition. 
URL https://bl.iro.bl.uk/concern/datasets/de43a15c-e000-4fec-8b66-7ca94ae13db3
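
The annotated files described above use the WebAnno TSV 3.2 format, in which header and sentence-text lines begin with `#` and each token occupies one tab-separated line (token id, character offsets, token text, then annotation columns, with `_` meaning no annotation). The following is a minimal reading sketch under those assumptions only. The sample content, the layer name `webanno.custom.Toponym` and the feature name `wiki_link` are hypothetical and are not taken from the dataset; the published files may use different layers, and sub-token or multi-token span markers are not handled here.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    token_id: str           # e.g. "1-3" = sentence 1, token 3
    start: int              # character offset where the token begins
    end: int                # character offset where the token ends
    text: str
    annotations: List[str]  # raw annotation columns, "_" when empty


def parse_webanno_tsv(content: str) -> List[Token]:
    """Collect token lines from WebAnno TSV text, skipping headers and blanks."""
    tokens = []
    for line in content.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # format header, layer definition, or "#Text=" line
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # not a token line
        start, end = fields[1].split("-")
        annotations = [f for f in fields[3:] if f]
        tokens.append(Token(fields[0], int(start), int(end), fields[2], annotations))
    return tokens


# Hypothetical sample, written by hand for illustration only.
SAMPLE = (
    "#FORMAT=WebAnno TSV 3.2\n"
    "#T_SP=webanno.custom.Toponym|wiki_link\n"
    "\n"
    "\n"
    "#Text=Fire at Poole.\n"
    "1-1\t0-4\tFire\t_\t\n"
    "1-2\t5-7\tat\t_\t\n"
    "1-3\t8-13\tPoole\thttps://en.wikipedia.org/wiki/Poole\t\n"
    "1-4\t13-14\t.\t_\t\n"
)

tokens = parse_webanno_tsv(SAMPLE)
# Keep only tokens that carry at least one non-empty annotation value.
annotated = [t for t in tokens if any(a != "_" for a in t.annotations)]
for t in annotated:
    print(t.text, t.start, t.end, t.annotations)
```

Keeping the raw annotation columns as strings avoids guessing at the dataset's layer structure; a fuller reader would map each column to the layer declared in the file header.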
 
Title Dataset for Toponym Resolution in Nineteenth-Century English Newspapers 
Description We present version 2 of a new dataset for the task of toponym resolution in digitised historical newspapers in English. It consists of 343 annotated articles from newspapers based in four different locations in England (Manchester, Ashton-under-Lyne, Poole and Dorchester), published between 1780 and 1870. The articles have been manually annotated with mentions of places, which are linked---whenever possible---to their corresponding entry on Wikipedia. The dataset is published on the British Library shared research repository, and is especially of interest to researchers working on improving semantic access to historical newspaper content. We share the 343 annotated files (one file per article) in the WebAnno TSV file format version 3.2, a CoNLL-based file format. We additionally provide a TSV file with metadata at the article level, and the annotation guidelines. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/f3686eb9-4227-45cb-9acb-0453d35e6a03
 
Title Digitised historical newspapers 
Description Newspapers digitised by the British Library for the LwM project, with OCR processing performed by FindMyPast and supplied in a format consistent with the BNA. The dataset comprises ~630 GB of digitised text in METS/ALTO XML format and 435,642 JP2 image files (~6 TB) for 94 newspaper titles. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? No  
Impact Analysis of British historical newspaper content at scale. 
 
Title Example computer vision classification training data derived from British Library 19th Century Books Image collection 
Description Example computer vision classification training data derived from British Library 19th Century Books Image collection. This dataset provides training data for image classification for use in a computer vision workshop. The images are derived from 'Digitised Books - Images identified as Embellishments. c. 1510 - c. 1900. JPG' from the year '1839'. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
Impact 85 Downloads of the dataset 
URL https://zenodo.org/record/3689444
 
Title Example computer vision classification training data derived from British Library 19th Century Books Image collection 
Description Example computer vision classification training data derived from British Library 19th Century Books Image collection. This dataset provides training data for image classification for use in a computer vision workshop. The images are derived from 'Digitised Books - Images identified as Embellishments. c. 1510 - c. 1900. JPG' from the year '1839'. Currently, four folders are included, containing a variety of images derived from the BL books corpus: 'cv_workshop_exercise_data' includes images of 'building', 'people' and 'coat of arms'; 'humancats' contains images of humans and images of cats; the 'fashion' and 'portraits' folders both contain images of people organised into 'female' and 'male'. These labels were annotated by a single annotator and these categories may themselves not be meaningful. They are included in the workshop data as a point of discussion about how we should label data both in general and when working with historical data. This data is intended primarily as an educational resource. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://zenodo.org/record/3667575
 
Title Example computer vision classification training data derived from British Library 19th Century Books Image collection 
Description Example computer vision classification training data derived from British Library 19th Century Books Image collection. This dataset provides training data for image classification for use in a computer vision workshop. The images are derived from 'Digitised Books - Images identified as Embellishments. c. 1510 - c. 1900. JPG' from the year '1839'. Currently, four folders are included, containing a variety of images derived from the BL books corpus: 'cv_workshop_exercise_data' includes images of 'building', 'people' and 'coat of arms'; 'humancats' contains images of humans and images of cats; the 'fashion' and 'portraits' folders both contain images of people organised into 'female' and 'male'. These labels were annotated by a single annotator and these categories may themselves not be meaningful. They are included in the workshop data as a point of discussion about how we should label data both in general and when working with historical data. This data is intended primarily as an educational resource. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://zenodo.org/record/3689444
 
Title G. Tolfo, O. Vane, K. McDonough, Metadata for BL map collections including the King's Topographical Map collection and the Goad Fire Insurance Maps. 
Description Csv exports of British Library records for King's Topographical collection maps and Goad Fire Insurance maps. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact We are in the process of making this data interoperable with our other map metadata from the National Library of Scotland, at which point we will release it to the public as a tool for improving discovery of digitised map content in British heritage institutions. 
 
Title Images from Newspaper Navigator predicted as maps, with human corrected labels 
Description The Dataset contains images derived from the Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/). The Newspaper Navigator dataset consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project. source: https://news-navigator.labs.loc.gov/ One of these categories is 'maps'. In the original training data for Newspaper Navigator, there were relatively few labelled examples of maps. The predictions for maps have an Average Precision of 69.5%, and 34 images in the validation data. This dataset contains a sample of these images which have been predicted as 'maps'. It also includes additional labels which indicate whether the predicted map image is a 'map' or 'not a map'. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact Used as data for an example notebook showing how to train computer vision models. 59 downloads of the dataset (5/11/2020) 
URL https://zenodo.org/record/4156510
 
Title Images from Newspaper Navigator predicted as maps, with human corrected labels 
Description The Dataset contains images derived from the Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/). [The Newspaper Navigator dataset] consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project. source: https://news-navigator.labs.loc.gov/ One of these categories is 'maps'. In the original training data for Newspaper Navigator, there were relatively few labelled examples of maps. The predictions for maps have an Average Precision of 69.5%, and 34 images in the validation data. This dataset contains a sample of these images which have been predicted as 'maps'. It also includes additional labels which indicate whether the predicted map image is a 'map' or 'not a map'. The data is organised as follows: the images themselves can be found in 'newspaper_maps.zip'; `2020_30_10_13_19_228_sample.json` contains metadata about each image drawn from the Newspaper Navigator Dataset; `map_labels.csv` contains the labels for the images as a CSV file. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://zenodo.org/record/4156509
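
Because the dataset pairs each image predicted as a map with a human-corrected 'map' or 'not a map' label, one simple use is to check what share of the sampled predictions a human confirmed. The sketch below illustrates that calculation. The column names `file` and `label` and the four rows are hypothetical, since the layout of `map_labels.csv` is not given here, and the resulting share is a plain proportion on a sample, not the Average Precision figure quoted in the description.

```python
import csv
import io
from collections import Counter

# Hypothetical rows standing in for map_labels.csv; the real column
# names and values should be checked against the published file.
SAMPLE_CSV = """file,label
a.jpg,map
b.jpg,not a map
c.jpg,map
d.jpg,map
"""


def label_counts(csv_text: str, label_column: str = "label") -> Counter:
    """Count how often each human-corrected label occurs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[label_column].strip() for row in reader)


def share_confirmed(counts: Counter, positive: str = "map") -> float:
    """Proportion of predicted maps that the annotator confirmed as maps."""
    total = sum(counts.values())
    return counts[positive] / total if total else 0.0


counts = label_counts(SAMPLE_CSV)
share = share_confirmed(counts)
print(dict(counts), share)
```

With the real file, `io.StringIO(csv_text)` would be replaced by an opened file handle; the counting logic stays the same.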
 
Title Images from Newspaper Navigator predicted as maps, with human corrected labels 
Description The Dataset contains images derived from the Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/). [The Newspaper Navigator dataset] consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project. source: https://news-navigator.labs.loc.gov/ One of these categories is 'maps'. In the original training data for Newspaper Navigator, there were relatively few labelled examples of maps. The predictions for maps have an Average Precision of 69.5%, and 34 images in the validation data. This dataset contains a sample of these images which have been predicted as 'maps'. It also includes additional labels which indicate whether the predicted map image is a 'map' or 'not a map'. The data is organised as follows: the images themselves can be found in 'newspaper_maps.zip'; `2020_30_10_13_19_228_sample.json` contains metadata about each image drawn from the Newspaper Navigator Dataset; `map_labels.csv` contains the labels for the images as a CSV file. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://zenodo.org/record/4156510
 
Title K. McDonough, O. Vane, A. Krause, C. Fleet, Metadata from National Library of Scotland Ordnance Survey map collections 
Description Shapefile metadata for Ordnance Survey map sheets received from Chris Fleet at the National Library of Scotland for analysis alongside digitised images of NLS Ordnance Survey maps. We exported this data and built a relational database to make the data more accessible. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact By re-formatting this data and linking it to additional metadata, we are enabling a better understanding of a) where there are concentrations or gaps in the digital record and b) how revision practices varied over British space. 
 
Title Kasra Hosseini, language model zoo 
Description Collection of trained word embeddings and language models, mainly by using the book database. Various model types are trained and added to the collection, e.g., word2vec, fasttext, contextual string embeddings (Flair), BERT. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact Language models and word-embeddings are one of the main ingredients in many NLP-related tasks in this project. Here, we keep track of the trained models, so researchers can easily find the models and use them for their research. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/blob/master/language_models/noteb...
 
Title Living Machines atypical animacy dataset 
Description Atypical animacy detection dataset, based on nineteenth-century sentences in English extracted from an open dataset of nineteenth-century books digitized by the British Library (available via https://doi.org/10.21250/db14, British Library Labs, 2014). This dataset contains 598 sentences containing mentions of machines. Each sentence has been annotated according to the animacy and humanness of the machine in the sentence. This dataset has been created as part of the following paper: Ardanuy, M. C., F. Nanni, K. Beelen, Kasra Hosseini, Ruth Ahnert, J. Lawrence, Katherine McDonough, Giorgia Tolfo, D. C. Wilson and B. McGillivray. "Living Machines: A study of atypical animacy." In Proceedings of the 28th International Conference on Computational Linguistics (COLING2020). 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/work/323177af-6081-4e93-8aaf-7932ca4a390a
 
Title Living with Machines alpha and beta Zooniverse 'accident' task data 
Description Data created through crowdsourcing tasks hosted on the Zooniverse platform. Members of the public were asked to look at a selection of articles from 19th century newspapers that mentioned machines and decide if they described an industrial accident. A further task asked participants to transcribe personal, organisational and place names mentioned, and add a brief summary of relevant accidents. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact Publishing the data is part of our contract with crowdsourcing participants, and provides evidence of our commitment to transparency and data sharing. 
URL https://doi.org/10.23636/1197
 
Title Living with Machines alpha and beta Zooniverse 'accident' task data 
Description Data created through crowdsourcing tasks hosted on the Zooniverse platform. Members of the public were asked to look at a selection of articles from 19th century newspapers that mentioned machines and decide if they described an industrial accident. A further task asked participants to transcribe personal, organisational and place names mentioned, and add a brief summary of relevant accidents. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/work/4d262a8a-255b-45a1-a0fe-dc4af48e9798
 
Title Mariona Coll-Ardanuy - Creation of toponym resolution datasets (ongoing). 
Description Creation of toponym resolution datasets: ~1000 newspaper articles manually annotated with mentions of places and their geographical coordinates. The annotations are not yet complete. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? No  
Impact Ongoing work. We aim to publish the dataset as soon as the annotations are complete. They will serve to assess the performance of our toponym resolution method and will be a contribution to several fields, such as geographic information retrieval, computational linguistics, and digital humanities. 
 
Title Mariona Coll-Ardanuy, Creation of a gazetteer for toponym resolution (ongoing). 
Description Creation of a gazetteer for toponym resolution (alpha version). This is a Wikipedia-based gazetteer, enriched with data from the geographical database Geonames. The alpha version of the code that creates the gazetteer has already been released (see URL below). This work is ongoing: we are working on enriching it with data from historical sources (maps and text). 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact The gazetteer itself has not been made available, but the publication and the code repository with instructions on how to create it are publicly available. 
URL https://github.com/alan-turing-institute/lwm_GIR19_resolving_places
 
Title May's British and Irish Press Guide and Advertiser's Handbook & Dictionary etc. (1871-1880) 
Description Newspaper directories produced and published annually in 19th-century Britain by advertising agent Frederick May and successors, containing information on newspapers, magazines and periodicals, arranged in alphabetical and sometimes tabular order. Information for each title included price, publisher, office, and political and religious leaning. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/0a4f3f09-11ff-4360-a73e-ce3a7654f14c
 
Title Neural Language Models for Nineteenth-Century English 
Description We present four types of neural language models trained on a large historical dataset of books in English, published between 1760 and 1900, and comprising ~5.1 billion tokens. The language model architectures include word type embeddings (word2vec and fastText) and contextualized models (BERT and Flair). For each architecture, we trained a model instance using the whole dataset. Additionally, we trained separate instances on text published before 1850 for the type embeddings, and four instances considering different time slices for BERT. Our models have already been used in various downstream tasks where they consistently improved performance. 
Type Of Material Computer model/algorithm 
Year Produced 2021 
Provided To Others? Yes  
Impact Even though word2vec has been around for almost a decade (an eternity in the fast-moving NLP ecosystem), the word type embeddings it produces persist as popular instruments, especially for interdisciplinary research (Azarbonyad et al. 2017; Hengchen, Ros, & Marjanen, 2019). The more recent fastText model extends word2vec by using subword information. Contextualized language models have meant a breakthrough in NLP research (see e.g. Smith (2019) for an overview), as they represent words in the contexts in which they appear, instead of conflating all senses, one of the main criticisms of word type embeddings. The potential of using such models for historical research is immense as they allow a more accurate context-dependent representation of meaning. These embeddings can also be used in existing tools for historical research (e.g. Hosseini, Nanni, and Coll Ardanuy (2020)). Given that existing libraries, such as Gensim, Flair, or Hugging Face, provide convenient interfaces to work with these embeddings, we are confident that our historical models will serve the needs of a wide variety of scholars, from NLP and data science to the humanities, for different tasks and research purposes, such as measuring how words change meaning over time (Kulkarni, Al-Rfou, Perozzi, & Skiena, 2015; Tahmasebi, Borin, & Jatowt, 2018), automatic OCR correction (Hämäläinen & Hengchen, 2019), interactive query expansion or, more generally, any research that involves diachronic language change. 
URL https://zenodo.org/record/4782245
 
Title Newspaper Directories digitised, OCRed, modelled and structured data extracted from Mitchell's directories (1846-1909) 
Description This collection includes a subset of Mitchell's Newspaper Press Directories which is annotated and structured for future incorporation in the Newspaper database. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact The information extracted from the Press Directories will significantly contribute to enriching newspaper data received from Heritage Made Digital, FindMyPast and JISC. It will also contribute to the environmental scan project and paper. 
 
Title Ordnance Survey Old / First series England and Wales 1:63360 (georeferenced sheet images) 
Description Map sheet images for the Ordnance Survey Old Series / First Series England and Wales 1:63360, georeferenced and cropped at the neatline (can be viewed together as a seamless composite). GeoTIFF format. The original (ungeoreferenced) sheet images can be found at: https://commons.wikimedia.org/wiki/Category:Ordnance_Survey_Old/First_series_England_and_Wales_1:63360_(full_sheets). The sheets were georeferenced by relating the sheet corners to their coordinates (no internal control points applied), using sheet boundary data created by the Charles Close Society (see https://www.charlesclosesociety.org/KMLFILE). Where sheets were issued as quarter sheets (NW, NE, SW, SE), a digital composite of the full sheet has been created. The filenames include the sheet number. See an index map at: https://commons.wikimedia.org/wiki/Category:Ordnance_Survey_Old/First_series_England_and_Wales_1:63360_(full_sheets)#/media/File:Ordnance_Survey_One-inch_Old_Series_England_&_Wales_Index.png The imagery is medium resolution. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact
URL https://bl.iro.bl.uk/concern/datasets/2fa13eb5-1767-469b-b4c0-d9d518bfc1b3#?c=0&m=0&s=0&cv=0&xywh=0%...
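
Georeferencing from sheet corners alone, as described above, amounts to stretching the image rectangle onto the rectangle defined by the corner coordinates. The sketch below shows the simplest version of that idea, assuming a north-up sheet whose edges are parallel to the coordinate axes, and expresses it as a six-coefficient affine transform in the order GDAL uses for geotransforms. The image size and corner coordinates are invented for illustration and do not correspond to any sheet in the dataset; the actual processing may have used a different transformation.

```python
from typing import Tuple

GeoTransform = Tuple[float, float, float, float, float, float]


def corner_geotransform(width_px: int, height_px: int,
                        west: float, north: float,
                        east: float, south: float) -> GeoTransform:
    """Affine coefficients mapping pixel (col, row) to map (x, y).

    Order: (x of upper-left corner, pixel width, row rotation,
            y of upper-left corner, column rotation, pixel height).
    Pixel height is negative because rows increase southwards.
    """
    pixel_w = (east - west) / width_px
    pixel_h = (south - north) / height_px
    return (west, pixel_w, 0.0, north, 0.0, pixel_h)


def pixel_to_map(gt: GeoTransform, col: float, row: float) -> Tuple[float, float]:
    """Apply the affine transform to one pixel position."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


# Invented example: a 4000 x 3000 pixel image covering 40 km x 30 km.
gt = corner_geotransform(4000, 3000,
                         west=400000.0, north=300000.0,
                         east=440000.0, south=270000.0)
top_left = pixel_to_map(gt, 0, 0)
centre = pixel_to_map(gt, 2000, 1500)
bottom_right = pixel_to_map(gt, 4000, 3000)
print(top_left, centre, bottom_right)
```

Because no internal control points are used, any distortion inside the sheet (paper stretch, scanning skew) is carried over unchanged; only the corners are guaranteed to land on their stated coordinates.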
 
Title StopsGB: Structured Timeline of Passenger Stations in Great Britain 
Description Michael Quick's book _Railway Passenger Stations in Great Britain: a Chronology_ offers a uniquely rich and detailed account of Britain's changing railway infrastructure. Its listing of over 12,000 stations allows us to reconstruct the coming of rail at both micro- and macro-scales. However, being published originally as a book (and subsequently online as a PDF created from an underlying MS Word document), this resource was not well suited for systematic linking to other data. We now present a new, automatically generated dataset that provides the rich detail of this exceptional resource in a structured format. Each station described in the _Chronology_ is given certain attributes, such as operating companies and opening and closing dates, and is georeferenced and linked---whenever possible---to its corresponding entry on Wikidata. We name this structured, linked, and georeferenced dataset 'StopsGB' (Structured Timeline of Passenger Stations in Great Britain), and we make it openly available. We believe this dataset (and the method used to create it) will be of widespread interest across the historical, digital library and semantic web communities, and that it will be a key resource for ongoing research into the impact of the railway in Great Britain. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact This is a new contribution. We expect that this dataset (and the method used to create it) will be of widespread interest across the historical, digital library and semantic web communities, and that it will be a key resource for ongoing research into the impact of the railway in Great Britain. 
URL https://bl.iro.bl.uk/concern/datasets/0abea1b1-2a43-4422-ba84-39b354c8bb09
 
Title Supplementary material for 'A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching' 
Description Supplementary material for the https://github.com/Living-with-machines/LwM_SIGSPATIAL2020_ToponymMatching repository, containing the underlying code and materials for the paper 'A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching', accepted to SIGSPATIAL2020 as a poster paper. Coll Ardanuy, M., Hosseini, K., McDonough, K., Krause, A., van Strien, D. and Nanni, F. (2020): A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching, SIGSPATIAL: Poster Paper. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://zenodo.org/record/4034818
 
Title Supplementary material for 'A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching' 
Description Supplementary material for the https://github.com/Living-with-machines/LwM_SIGSPATIAL2020_ToponymMatching repository, containing the underlying code and materials for the paper 'A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching', accepted to SIGSPATIAL2020 as a poster paper. Coll Ardanuy, M., Hosseini, K., McDonough, K., Krause, A., van Strien, D. and Nanni, F. (2020): A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching, SIGSPATIAL: Poster Paper. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://zenodo.org/record/4034819
 
Title The Newspaper Press Directory (1846-1880) 
Description Newspaper directories produced and published annually in 19th-century Britain by advertising agent Charles Mitchell. Newspapers are listed primarily in alphabetical order of the town where the title was published. Information for each title included: features connected with the district such as population and trade; principal towns in the district; title, price, day of publication; politics; date of first issue; political leanings and special interests; proprietors and publishers. Information on some overseas titles is also included in selected years. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/020c22c4-d1ee-4fca-bf75-0420fe59347a
 
Description Collaboration with the Estonian War Museum on a Europeana-funded project 
Organisation Europeana
Country Netherlands 
Sector Public 
PI Contribution I was invited to be a named researcher on a bid by the Estonian War Museum to run a workshop and pilot mini-crowdsourcing projects, funded by Europeana. I contributed to their survey design, and devised and ran 6 structured sessions within a 2 day workshop, designed to take organisations through the processes involved in planning a successful crowdsourcing project. The workshops included prompts for discussion across many departments and disciplines within an organisation, and concluded with group presentations of the ideas developed through the workshops.
Collaborator Contribution The project "Crowdsourcing for military heritage in Estonia" is funded with 9,925 euros for the period January to June 2022. The Estonian War Museum leads the project, organising the workshops, running the survey and reporting on the results, and monitoring the five projects as they develop to June 2022.
Impact The final outputs will be five small-scale crowdsourcing projects by Estonian museums, a survey, and publications on the lessons the institutions running them learned from the research project.
Start Year 2021
 
Description Humphrey Southall (Vision of Britain) 
Organisation University of Southampton
Country United Kingdom 
Sector Academic/University 
PI Contribution Reuse of data and citation.
Collaborator Contribution Data sets shared in addition to those available for download on the Vision of Britain site, including a simplified data table.
Impact Data sharing.
Start Year 2019
 
Description Living with Machines and Find My Past 
Organisation Findmypast
Country United Kingdom 
Sector Private 
PI Contribution We will be sharing the methods and outcomes of our research on this data, for example OCR correction and toponym resolution.
Collaborator Contribution FMP has shared newspaper data with Living with Machines for two counties (Lancashire and Dorset), and in the near future will be sharing all newspapers from Britain dating 1780-1920 that were digitised by FMP for the British Newspaper Archive. A member of FMP also sits on Living with Machines' Advisory Board.
Impact Findmypast has provided samples of the British Library's digitised Newspaper Collection and has advised us through its membership of the Living with Machines Advisory Board. There are prospects of working together on OCR correction following the ingestion of other incoming full data-sets from the same collection.
Start Year 2018
 
Description Living with Machines and National Library of Scotland 
Organisation National Library of Scotland
Country United Kingdom 
Sector Academic/University 
PI Contribution Living with Machines initiated contact with Chris Fleet, map curator at the NLS, to investigate access to their digitized map collections. K. McDonough and O. Vane have worked closely with Fleet over the last 9 months to share and evaluate the digital map holdings. We organized a workshop (June 2019) at the Turing/BL with Chris and other historical maps experts to explore best practices in working with large collections for a digital humanities project. We have shared back reflections and code for enriching the collection metadata and visualizing the collections, and have also developed a close working relationship that will continue to grow (through the sharing of additional maps and metadata as well as collaborative research into other ways of sharing digital map data with researchers through IIIF).
Collaborator Contribution NLS Maps curator Chris Fleet has shared a subset of the 200,000 digitized sheets, to be expanded on in the near future. He has provided extensive advice and support for working with the metadata, accessing versions of the maps as web map tiles, and thinking about the next steps of using these materials in a computational research environment. He has also been immensely helpful in connecting Living with Machines to the small, but growing community of researchers using machine learning methods with maps.
Impact Blog posts (Computational Approaches to Ordnance Survey Maps: Finding words in maps, part 2: seeing the results blog post); Talks (Katie McDonough, Olivia Vane, and Daniel van Strien gave a '21st Century Talk' for British Library staff: 'Maps and Machines: Using Computer Vision to Analyze the Geography of Industrialization (1780-1920)', 14 Jan 2020; Daniel van Strien, Kaspar Beelen, CREATE Digital History Workshop: Maps-as-Data: Analysing Historical Maps with Computer Vision, Feb 2020; Katherine McDonough, "Living with Machines," presentation at DH Seminar, Center for Spatial and Textual Analysis, Stanford University, December 2 2019; Katherine McDonough, "Living with Machines," invited presentation at Spatial Relationships in Text as Data, The Alan Turing Institute, October 28, 2019; Katherine McDonough and Jon Lawrence, "An introduction to Living with Machines," University of Exeter DH Seminar, 23 October 2019); Workshops (Daniel van Strien, British Library Digital Scholarship Training program, workshop on computer vision for historical maps, 13 February 2020; and Katherine McDonough, Fantastic Futures, invited presentation and workshop on computer vision for historical maps, 4-5 December 2019); and Meetings (Katherine McDonough organized meetings with US experts in historical map processing using computer vision, 29/8/2019 and 1/11/2019).
Start Year 2019
 
Title Branching sparklines / line graphs 
Description This notebook demonstrates the branching design used in Press Picker: an interactive visualisation tool for newspaper metadata at the British Library, created in the Living with Machines project. Press Picker shows the holdings per year of different UK newspapers at the library, and their different formats. We used branching to communicate newspapers changing their name. Through history, newspapers sometimes change their name multiple times, particularly local papers. For example, The Athletic Reporter in 1886 becomes The Reporter, which in 1888 becomes The Midland Counties Reporter and General Advertiser, which in 1889 becomes The Reporter and General Advertiser, and so on. In the British Library data, a new name is treated as a wholly separate record. Introducing this branching means we bring together data that, to some extent, refers to the same thing.
Type Of Technology Webtool/Application 
Year Produced 2021 
Impact
URL https://observablehq.com/@oliviafvane/branching-sparklines-line-graphs
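The idea of bringing together records that are name changes of the same paper can be sketched in Python. This is a minimal illustration only, not the code in the notebook or in Press Picker: the `succeeded_by` mapping and the `title_chains` helper are hypothetical, and the titles are taken from the example in the description above. It handles simple linear chains of renamings.

```python
def title_chains(succeeded_by):
    """Group titles into chains by following each name change in order.

    `succeeded_by` maps a title to the title that replaced it. This is a
    hypothetical structure for illustration, not the British Library schema.
    """
    successors = set(succeeded_by.values())
    # A chain starts at any title that is not itself the successor of another.
    starts = [title for title in succeeded_by if title not in successors]
    chains = []
    for start in starts:
        chain = [start]
        while chain[-1] in succeeded_by:
            following = succeeded_by[chain[-1]]
            if following in chain:  # guard against cyclic data
                break
            chain.append(following)
        chains.append(chain)
    return chains


changes = {
    "The Athletic Reporter": "The Reporter",
    "The Reporter": "The Midland Counties Reporter and General Advertiser",
    "The Midland Counties Reporter and General Advertiser":
        "The Reporter and General Advertiser",
}
chains = title_chains(changes)
for chain in chains:
    print(" -> ".join(chain))
```

Each chain can then be drawn as one connected line, so that separate catalogue records for the same paper are read together.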
 
Title DeezyMatch 
Description DeezyMatch: A Flexible Deep Neural Network Approach to Fuzzy String Matching. DeezyMatch can be applied for performing the following tasks: record linkage; candidate selection for entity linking systems; toponym matching.
Type Of Technology Software 
Year Produced 2020 
Open Source License? Yes  
URL https://zenodo.org/record/3983554
 
Title DeezyMatch 
Description DeezyMatch: A Flexible Deep Neural Network Approach to Fuzzy String Matching. DeezyMatch can be applied for performing the following tasks: record linkage; candidate selection for entity linking systems; toponym matching.
Type Of Technology Software 
Year Produced 2020 
Open Source License? Yes  
URL https://zenodo.org/record/3983555
 
Title Living-with-machines/hmd_newspaper_dl: Initial release 
Description This release is for a version of the code which works with the current version of the British Library Research Repository. What's Changed: update code to support new BL repo by @davanstrien in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/4; Bump addressable from 2.7.0 to 2.8.0 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/5; Bump rexml from 3.2.4 to 3.2.5 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/7; Bump nokogiri from 1.11.0 to 1.12.5 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/6. New Contributors: @dependabot made their first contribution in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/5. Full Changelog: https://github.com/Living-with-machines/hmd_newspaper_dl/compare/v0.0.1...v0.0.2
Type Of Technology Software 
Year Produced 2021 
Impact Code for bulk downloading newspaper datasets 
URL https://zenodo.org/record/5571839
 
Title MapReader 
Description MapReader is a free, open-source software library written in Python for analyzing large map collections (scanned or born-digital). This library transforms the way historians can use maps by turning extensive, homogeneous map sets into searchable primary sources. MapReader allows users with little or no computer vision expertise to i) retrieve maps via web-servers; ii) preprocess and divide them into patches; iii) annotate patches; iv) train, fine-tune, and evaluate deep neural network models; and v) create structured data about map content. 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact Further applications in a new domain: the Turing project Scivision is applying MapReader in a plant phenotyping task. 
URL https://github.com/Living-with-machines/MapReader
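Step ii in the description, dividing a map into patches, can be illustrated with a short Python sketch. This does not use or reproduce MapReader's API: the `patch_boxes` function and its parameters are hypothetical, and it only computes the pixel boxes that would tile an image of a given size.

```python
def patch_boxes(width, height, size):
    """Return (left, top, right, bottom) pixel boxes that tile an image.

    Patches on the right and bottom edges are clipped to the image bounds,
    so they may be smaller than `size`.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    boxes = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            right = min(left + size, width)
            bottom = min(top + size, height)
            boxes.append((left, top, right, bottom))
    return boxes


# A 250 x 100 pixel sheet cut into patches of at most 100 x 100 pixels.
boxes = patch_boxes(width=250, height=100, size=100)
for box in boxes:
    print(box)
```

Each box could then be cropped from the scanned sheet and passed on for annotation or classification.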
 
Title Observable notebook 'Heatmap for polygons' 
Description JavaScript Observable code notebook demonstrating a geospatial visualisation technique: "Visualise overlaps in a large polygon dataset: colourise-alpha using WebGL shaders + PIXI.js". The code notebook demonstrates the technique on historical maps data from National Library of Scotland. 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact Thanks from National Library of Scotland, whose data it is demonstrated on, who described the code as "really interesting and useful" for them. 
URL https://observablehq.com/@oliviafvane/heatmap-for-polygons
 
Title Press Picker: An interactive visualisation tool for newspaper metadata 
Description Press Picker was created to help select British Library newspaper titles for digitisation. Read more about the context in this blog post and see an interactive demo in this post. The tool provides an overview of newspaper holdings over time, their different formats (hardcopy or microfilm), and the relationship between titles connected by name changes. Titles can be selected within the interface and their data exported. We are sharing the code for reuse. Press Picker consists of two Python Jupyter notebooks. 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact Inquiry about reuse from Berlin State Library (Staatsbibliothek zu Berlin) 
URL https://github.com/Living-with-machines/PressPicker_public
 
Title alan-turing-institute/lwm_ARTIDIGH_2020_OCR_impact_downstream_NLP_tasks: ARTIDIGH Zenodo 
Description Small version bump with updated linguistic processing notebooks. 
Type Of Technology Software 
Year Produced 2020 
URL https://zenodo.org/record/3610375
 
Title alan-turing-institute/lwm_ARTIDIGH_2020_OCR_impact_downstream_NLP_tasks: ARTIDIGH Zenodo 
Description Small version bump with updated linguistic processing notebooks. 
Type Of Technology Software 
Year Produced 2020 
URL https://zenodo.org/record/3611200
 
Title davanstrien/computer-vision-DHNoridic-2020-workshop 0.1 
Description An introduction to computer vision for working with maps: workshop at DHN 2020 
Type Of Technology Software 
Year Produced 2020 
URL https://zenodo.org/record/4106323
 
Title davanstrien/computer-vision-DHNoridic-2020-workshop 0.1 
Description An introduction to computer vision for working with maps: workshop at DHN 2020 
Type Of Technology Software 
Year Produced 2020 
URL https://zenodo.org/record/4106322
 
Title deduplify - author Sarah Gibson 
Description deduplify is a Python command line tool that will search a directory tree for duplicated files and optionally remove them. It generates an MD5 hash for each file recursively under a target directory and identifies the filepaths that generate unique and duplicated hashes. When deleting duplicated files, it deletes those deepest in the directory tree first leaving the last present. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact The deduplify tool enables the deduplication of file records in messy datasets and has been used within the process of wrangling the JISC1 & JISC2 newspaper datasets into a form amenable to further processing. 
URL https://github.com/Living-with-machines/deduplify
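The hashing step described above can be sketched with the Python standard library. This is an illustrative sketch under stated assumptions, not deduplify's own code: the function names `md5_of_file` and `find_duplicates` are hypothetical, and the sketch only identifies duplicates without deleting anything.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Map each MD5 hash to the file paths sharing it, keeping only duplicates."""
    by_hash = defaultdict(list)
    # os.walk visits every directory under `root` recursively.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_hash[md5_of_file(path)].append(path)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}
```

Calling `find_duplicates("some/directory")` returns a dictionary whose values are groups of paths with identical content; a deletion policy, such as the deepest-first rule in the description, would be applied to each group separately.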
 
Title defoe, the Spark-based toolbox for analysing historical datasets
Description This work presents defoe, a new scalable and portable digital eScience toolbox that enables historical research. It allows for running text mining queries across large datasets, such as historical newspapers and books, in parallel via Apache Spark. It handles queries against collections that comprise several XML schemas and physical representations. The proposed tool has been successfully evaluated using five different large-scale historical text datasets and two HPC environments, as well as on desktops. Results show that defoe allows researchers to query multiple datasets in parallel from a single command-line interface and in a consistent way, without any HPC environment-specific requirements.
Type Of Technology Software 
Year Produced 2019 
Impact Originally developed by UCL and the British Library (funded by Jisc, 2015) then UCL (funded by 2016-2018), defoe was refactored and extended by EPCC, The University of Edinburgh, for: the Alan Turing Institute, funded by Scottish Enterprise as part of the Alan Turing Institute-Scottish Enterprise Data Engineering Program; the College of Arts Humanities and Social Sciences, The University of Edinburgh (2019-2020), as part of the Data Driven Innovation Programme funded by the Edinburgh and South-East Scotland City Region Deal; and Living with Machines (2019-2020).
URL https://github.com/alan-turing-institute/defoe
 
Title defoe_visualization, a collection of notebooks for analysing further the results obtained by defoe 
Description defoe_visualization is a repository of Jupyter notebooks which complements the defoe scalable and portable digital eScience toolbox for historical research. These notebooks allow researchers to explore query results from defoe and to post-process the results to reveal new insights into the historical data processed by defoe. The notebooks are complemented with sample data files with the query results produced by the authors. 
Type Of Technology Software 
Year Produced 2019 
Impact Developed by EPCC, The University of Edinburgh in conjunction with: the Alan Turing Institute (2018-2019) funded by Scottish Enterprise as part of the Alan Turing Institute-Scottish Enterprise Data Engineering Program; the College of Arts Humanities and Social Sciences, The University of Edinburgh (2019-2020) as part of the Data Driven Innovation Programme funded by the Edinburgh and South-East Scotland City Region Deal; and Living with Machines (2019-2020).
URL https://github.com/alan-turing-institute/defoe_visualization
 
Title flyswot 
Description flyswot is a command-line tool that supports British Library staff in processing 'legacy' digitised content using computer vision. It can be run across images in a directory to check for incorrect metadata. flyswot has the following features: UNIX-style search patterns for matching images to predict against; a CSV output containing the paths to the input images, the predicted label and the model's confidence for that prediction; a summary 'report' providing a high-level summary of the predictions made by flyswot; automatic download of the latest available flyswot model.
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact The British Library holds a large amount of 'legacy' digitised material (~1 Petabyte). Some of these images were previously assigned incorrect metadata as the result of limitations in a legacy digitised image platform. In particular, images of manuscript pages were given the label 'flysheet' when no other suitable label was available. As a result, many images are falsely labelled as 'flysheets'. As part of the move to a new digital library system, there is a desire to correct this metadata. The scale of this problem makes fully manual intervention challenging. Flyswot, and the associated machine learning models, were developed in collaboration with the Heritage Made Digital team within the library to support library staff in processing this material. Flyswot is actively being used in this workflow and is helping speed up the process of checking images and helping assess the required work in processing collections. Beyond this, flyswot has also identified collection items that didn't have pagination and as a result curators have intervened not only in digital collections but also with the physical items.
URL https://github.com/davanstrien/flyswot
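The workflow in the description (match image paths with a UNIX-style pattern, predict a label for each, write path, label and confidence to CSV) can be sketched as follows. This is not flyswot's implementation or interface: the column names, the `write_predictions` helper and the `predict_stub` function, which stands in for a trained model and always returns the same answer, are all assumptions made for illustration.

```python
import csv
import fnmatch
import io


def predict_stub(path):
    """Placeholder for a real image classifier: returns (label, confidence)."""
    return ("flysheet", 0.5)


def write_predictions(paths, pattern, predict, out):
    """Write one CSV row per path that matches a UNIX-style pattern.

    Returns the number of paths that matched.
    """
    writer = csv.writer(out)
    writer.writerow(["path", "predicted_label", "confidence"])
    matched = [p for p in paths if fnmatch.fnmatch(p, pattern)]
    for path in matched:
        label, confidence = predict(path)
        writer.writerow([path, label, f"{confidence:.2f}"])
    return len(matched)


buffer = io.StringIO()
count = write_predictions(
    ["ms_001_fs001r.tif", "ms_001_notes.txt", "ms_002_fs001r.tif"],
    "*fs*.tif",
    predict_stub,
    buffer,
)
print(buffer.getvalue())
```

Keeping the classifier behind a plain function argument means the same matching and reporting code could be reused when the model is replaced.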
 
Title jisc-wrangler - author Timothy Hobson 
Description jisc-wrangler is a Python tool written specifically to restructure and deduplicate XML files containing OCR content from the JISC 1 & JISC 2 newspaper dataset. It outputs a canonical file structure and filename convention amenable to further processing with the alto2txt tool. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact This tool makes the JISC 1 & JISC 2 newspaper datasets accessible to the research project by cleaning, deduplicating and standardising the directory structure and filenames. It performs an essential pre-processing step that unlocks the potential of this open-access dataset. 
URL https://github.com/Living-with-machines/jisc-wrangler
 
Title subsamplr - author Timothy Hobson 
Description subsamplr is a Python tool for representative subsampling from a population. It was designed for sampling from a large collection of digital newspapers, but is a generic tool that could be applied in any context in which metadata is available for a population and a subsample is desired. Any features in the metadata can be used as dimensions for subsampling. The tool is configurable to connect to a metadata database and includes example Jupyter notebooks. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact Many avenues of research within the Living with Machines project target historic newspaper data, and given the volume of data available, the first step is typically to sample from the various newspaper collections to produce an accessible subset of data on which research methodologies can be developed and tested. The subsamplr tool is designed for precisely this purpose and is therefore an important component in the research workflow across a wide variety of investigations in the project. It enables researchers to specify subsampling parameters from which data samples are (reproducibly) generated that satisfy the requirements of the particular research question at hand. 
URL https://github.com/Living-with-machines/subsamplr
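Subsampling along metadata dimensions, with reproducible output, can be illustrated by a small stratified sampler. This is a sketch of the general technique under stated assumptions, not subsamplr's algorithm or interface: the `stratified_sample` function, the record fields (`county`, `decade`) and the fixed sampling fraction are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(records, keys, fraction, seed=0):
    """Sample the same fraction from every stratum defined by `keys`.

    A seeded random generator makes the sample reproducible. Every
    non-empty stratum contributes at least one record.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[tuple(record[k] for k in keys)].append(record)
    sample = []
    # Iterate strata in sorted order so the result does not depend on input order
    # of the strata themselves.
    for stratum in sorted(strata):
        members = strata[stratum]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical metadata: 60 items spread evenly over 2 counties and 3 decades.
records = [
    {
        "id": i,
        "county": "Lancashire" if i % 2 == 0 else "Dorset",
        "decade": 1850 + 10 * (i % 3),
    }
    for i in range(60)
]
sample = stratified_sample(records, keys=("county", "decade"), fraction=0.2, seed=42)
print(len(sample))
```

Sampling within each stratum keeps the proportions of the chosen metadata features in the subsample close to those of the full collection.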
 
Description "How we collaborate" blog post series 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Blog post series reflecting on our experience of collaborating on the project.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/category/how-we-collaborate/
 
Description "Introducing the Language Lab" blog post 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Blog post introducing the Language Lab, which explored the social and cultural impact of the Industrial Revolution as reported in newspapers and other types of textual sources.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/introducing-the-language-lab/
 
Description "Introducing..." blog post series 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact We published a series of blog posts introducing each member of the Living with Machines team.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/category/the-team/
 
Description 'Data visualisation for cultural heritage collections' course at N8 Centre of Excellence in Computationally Intensive Research 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Third sector organisations
Results and Impact Olivia Vane delivered a two-part workshop on data visualisation for Digital Humanities. Split over two sessions, the workshops gave an overview of the key concepts in data visualisation, before moving to tackle more practical exercises in the second week.
Year(s) Of Engagement Activity 2021
URL https://n8cir.org.uk/events/data-visualisation-hums/
 
Description 124 Introduction to OCR and HTR 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Daniel van Strien presented as part of a British Library staff training workshop on OCR (Optical Character Recognition).
Year(s) Of Engagement Activity 2020
 
Description ACH talk: Bridging humanities: embedding public participation in a collaborative research project 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A talk for the Association for Computers and the Humanities ACH2021 conference in July 2021, presented by Mia and based on her work with Barbara McGillivray, Giorgia Tolfo, Emma Griffin and others in the project.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/bridging-humanities-embedding-public-participation-in-a-collaborati...
 
Description AI4LAM presentation: AI training resources for GLAM 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation introducing an "AI training resources for GLAM review" document. The presentation took place as part of an AI4LAM community call (https://sites.google.com/view/ai4lam)
Year(s) Of Engagement Activity 2021
URL https://docs.google.com/document/d/1l4KFhAX1nijBUmE5Srfcq2ELFvrYbm8fp3jaszsmiAE/edit?usp=sharing
 
Description An introduction to computer vision for working with digitised heritage collections (workshop) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A workshop with around 50 participants introducing deep learning-based computer vision methods to digital humanities researchers and heritage professionals.
Year(s) Of Engagement Activity 2020
URL https://github.com/Living-with-machines/computer-vision-DHNordic-2020-workshop
 
Description Andre Piza presented at "Future of Journalism" to Open Society Foundation Journalism Programme 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A presentation about the Living with Machines project started a dialogue with BBC News Labs and Open Society, leading to a talk from BBC News Labs Executive Product Manager (David Caswell) at the Alan Turing Institute and a visit from Open Society's Independent Journalism Senior Programme Specialist (Shuwei Fang). Opportunities for collaboration with LWM are now being explored with BBC News Labs.
Year(s) Of Engagement Activity 2019
 
Description Annotation session with the British Library staff, 2 August 2019, organised by Daniel van Strien, Mariona Coll Ardanuy, and Mia Ridge 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact We had an open annotation session in which we invited British Library staff members to help with our experiments. We planned four different linguistic annotation tasks (named entity recognition, recognition of machines, entity linking to Wikipedia, and semantic role labeling) on newspaper articles from the nineteenth century.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/collecting-annotations-from-british-library-staff/
 
Description Association for Computers and the Humanities paper presentation: Bridging humanities: embedding public participation in a collaborative research project 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A presentation for the annual Association for Computers and the Humanities conference that explicitly addressed the challenges of embedding crowdsourcing as a form of public engagement into a 'data science' research project with different conceptions of timelines, metrics for success, etc.
Year(s) Of Engagement Activity 2021
URL https://ach2021.ach.org/
 
Description Beta Test of Library Carpentry Introduction to AI and Machine Learning 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Workshop hosted by LIBER/BNF. Daniel van Strien contributed towards a beta test of a lesson that is currently in the early stages of development and is to become a part of the Library Carpentry Curriculum.
Year(s) Of Engagement Activity 2021
URL https://libereurope.eu/mec-events/beta-test-of-library-carpentry-introduction-to-ai-and-machine-lear...
 
Description Blog Post 'Heatmap for polygons: visualise overlaps in a large polygon dataset' 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact A technical 'how to' blog post on the Living with Machines website, describing a geospatial visualisation technique. National Library of Scotland (whose data the post demonstrates the technique on) and Registers of Scotland both fed back that the blog post was helpful and interesting.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/heatmap-for-polygons-visualise-overlaps-in-a-large-polygon-dataset/
 
Description Blog Post 'Press Picker code published' 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact "We are very pleased to share the code for 'Press Picker', our interactive data visualisation tool for newspaper metadata: https://github.com/Living-with-machines/PressPicker_public."
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/press-picker-code-published/
 
Description Blog Post on Sources Lab (Understanding the Victorian Newspaper Landscape) 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Blog post describing the work of the Sources Lab on digitizing and processing the Newspaper Press Directories.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/sources-understanding-the-victorian-newspaper-landscape/
 
Description Blog post 'Press Picker: visualising formats and title name changes in the British Library's newspaper holdings' 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact Blog post on the Living with Machines website: 'Press Picker: visualising formats and title name changes in the British Library's newspaper holdings'.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/press-picker-visualising-formats-and-title-name-changes-in-the-brit...
 
Description Blog post: "Finding your way among newspapers" 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Blog post "Finding your way among newspapers" on how to select newspapers for digitisation at the British Library.
Year(s) Of Engagement Activity 2020
URL http://livingwithmachines.ac.uk/finding-your-way-among-newspapers/
 
Description Blog post: 'Platforms for People-Powered Research' 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post highlighting contributions to a conference and sharing a video from the panel discussion.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/platforms-for-people-powered-research/
 
Description Blog post: Ad or not? New crowdsourcing task 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A blog post describing a new crowdsourcing task that aimed to make data from a previous task easier to analyse by classifying articles as being advertisements or not.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/ad-or-not-new-crowdsourcing-task/
 
Description Blog post: Bridging humanities: embedding public participation in a collaborative research project 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post highlighting our contribution to a panel at the Association for Computers and the Humanities conference.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/bridging-humanities-embedding-public-participation-in-a-collaborati...
 
Description Blog post: Exploring ideas for our Living with Machines exhibition 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post setting out exhibition themes and introducing our collaboration with Leeds Museums and Galleries.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/exploring-ideas-for-our-living-with-machines-exhibition/
 
Description Blog post: First crowdsourced datasets available 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A post in support of the first open data release from crowdsourcing activities on the project, linking to the British Library's research repository.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/first-crowdsourced-datasets-available/
 
Description Blog post: From prams to Parliament - what was a machine? Help us find out 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A blog post in support of the Comms launch for novel crowdsourcing tasks designed in collaboration with historians, computational linguists and others on the Living with Machines project.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/from-prams-to-parliament-what-was-a-machine-help-us-find-out/
 
Description Blog post: Highlights from crowdsourcing projects at the British Library 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The post provided progress reports on a range of crowdsourcing projects at the British Library, including the Zooniverse tasks created through Living with Machines.
Year(s) Of Engagement Activity 2020
URL https://blogs.bl.uk/digital-scholarship/2020/12/highlights-from-crowdsourcing-projects-at-the-britis...
 
Description Blog post: Learning from Zooniverse volunteers to improve crowdsourcing projects 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A blog post that describes how feedback from volunteers led to improvements in our crowdsourcing task launched in December 2020.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/learning-from-zooniverse-volunteers-to-improve-crowdsourcing-projec...
 
Description Blog post: Sharing our Delivery Plan 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The post celebrated the deposition of our 2019 Delivery Plan in the British Library's repository. Sharing it was part of our commitment to transparency, and to sharing our lessons learnt as we ourselves learn them.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/sharing-our-delivery-plan/
 
Description Blog post: What does a 'digital humanities research software engineer' do? 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A conversation between Mia and Olivia Vane designed to broaden the reach and demonstrate the range of experience, skills and job titles relevant to our job advertisement replacing Olivia as DH RSE. When we interviewed for the role, we learnt that this blog post was pivotal in the successful applicant deciding to apply.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/what-does-a-digital-humanities-research-software-engineer-do/
 
Description Blog post: What is a 'machine' anyway? Help us describe them 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A blog post in support of the Comms launch for novel crowdsourcing tasks designed in collaboration with historians, computational linguists and others on the Living with Machines project.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/what-is-a-machine-anyway-help-us-find-out/
 
Description British Library Open House Session at Boston Spa 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact The Library's Living with Machines team provides an update on this collaborative project, including the ways in which its work with data science and digitised collections benefits the Library.
Year(s) Of Engagement Activity 2020
 
Description British Library Open House Session at King's Cross St. Pancras 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact The Library's Living with Machines team provides an update on this collaborative project, including the ways in which its work with data science and digitised collections benefits the Library.
Year(s) Of Engagement Activity 2020
 
Description British Library Show and Tell Session at King's Cross St. Pancras 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other audiences
Results and Impact An interactive poster session about the various tasks and outcomes of the Projects Labs, attended by staff across the British Library and Alan Turing Institute.
Year(s) Of Engagement Activity 2019
 
Description British Library project web page 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Created a project web page on the British Library website to provide official visible information about the project in support of our other engagement activities.
Year(s) Of Engagement Activity 2021
URL https://www.bl.uk/projects/collective-wisdom
 
Description Cambridge GLAM Digital champions lightning talk "The Living with machines project" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact I presented the Living with Machines project to an audience of librarians and other professionals from the GLAM (Galleries, Libraries, Archives, Museums) sector.
Year(s) Of Engagement Activity 2020
URL https://www.eventbrite.co.uk/e/glam-digital-champions-digital-lunch-january-2020-tickets-89946158381...
 
Description Case Study OED API: Exploring word meaning in historical texts with computational methods 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post for the OED case studies series
Year(s) Of Engagement Activity 2021
URL https://public.oed.com/blog/case-study-oed-api/
 
Description Catching up with maps 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post on the Living with Machines website to provide a high-level update on the maps-related work in the project.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/catching-up-with-maps/
 
Description Code and Coffee
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post describing an internal project activity aimed at facilitating collaboration
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/code-and-coffee/
 
Description Collecting annotations from British Library staff 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post outlining an event held with British Library staff
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/collecting-annotations-from-british-library-staff/
 
Description Computational Approaches to Ordnance Survey Maps blog post 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This blog post introduces the preliminary work of the "Space & Time Lab" in Living with Machines, which experimented with computer vision methods for studying large sets of historical, digitized maps. With 179 page views, it generated several conversations with external researchers about our use of these methods in the humanities context.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/introducing-the-space-and-time-lab/
 
Description Computer Vision for Digital Heritage SIG 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post on the Living with Machines website announcing the new Computer Vision for Digital Heritage SIG.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/computer-vision-for-digital-heritage/
 
Description Computer Vision for the Humanities workshop (Warwick University) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact This workshop provides an introduction to computer vision aimed at humanities applications. In particular, it gives a high-level overview of machine learning based approaches to computer vision, focusing on supervised learning. The workshop includes discussion on working with historical data. The materials are based on in-progress Programming Historian lessons.
Year(s) Of Engagement Activity 2021
URL https://zenodo.org/record/4746493
 
Description Conference Roundtable: The Future of Spatial History for Spatial Humanities 2021/DHangouts 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Roundtable discussion on the future of spatial history with K. McDonough, J. Taylor, and L. Scholz, chaired by I. Gregory for the Spatial Humanities 2021 conference and presented as part of the DHangout series hosted by Lancaster University. Audience of about 35 people with conversation about the future of computational spatial historical research.
Year(s) Of Engagement Activity 2021
URL https://youtu.be/60aT8J4hMAA
 
Description Course 107 'Data Visualisation for Cultural Heritage Collections': British Library Digital Scholarship Training Programme 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Third sector organisations
Results and Impact In May 2020, Olivia Vane taught the rebooted Course 107 'Data Visualisation for Cultural Heritage Collections' for the British Library Digital Scholarship Training Programme: internal training in digital methods for British Library staff. The course was delivered over 2 sessions (4.5hrs in total) and included presentations and exercises with British Library datasets. It was taught over Zoom + Slack.
Year(s) Of Engagement Activity 2020
 
Description Crowdsourcing tasks 'What's that machine? Describe it!' and 'What's that machine? Classify it!' 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Building on the lessons learnt from earlier experiments, in early December we launched two new crowdsourcing projects devised in collaboration with researchers including historians and computational linguists. These projects aimed to integrate linguistic research questions with tasks that encouraged volunteers to engage with social and technological history in the pages of 19th century newspapers.

As part of the launch process we applied to become an official Zooniverse project, which included separate reviews by Zooniverse staff and volunteers. We tweaked the interfaces as a result, and were delighted to be recognised as an official Zooniverse project.

Nearly 10,000 tasks were completed by over 700 registered volunteers (and countless anonymous volunteers) within a week.
Year(s) Of Engagement Activity 2020
URL https://www.zooniverse.org/projects/bldigital/
 
Description D3 JavaScript visualisation in a Python Jupyter notebook 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post describing how to combine JavaScript, the visualisation library D3.js and Python Jupyter notebooks. Accompanying notebook code was published with this blogpost.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/d3-javascript-visualisation-in-a-python-jupyter-notebook/
 
Description Daniel van Strien: Flyswot: garden-variety machine learning applications conference presentation
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Conference presentation "Flyswot: garden-variety machine learning applications" at the ai4lam conference. Presenters: Daniel van Strien, Digital Curator at the British Library, Andrew Longworth, Digitisation Project Analyst at the British Library, Catherine Cronin, The Heritage Made Digital Team at the British Library
Year(s) Of Engagement Activity 2021
URL https://www.bnf.fr/en/program-international-conference-les-futurs-fantastiques-december-8-10-2021
 
Description Daniel van Strien The Carpentries: Introduction to AI for GLAM Workshop at AI4LAM conference 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Delivery of 'The Carpentries: Introduction to AI for GLAM' workshop online as part of the AI4LAM conference.
Year(s) Of Engagement Activity 2021
URL https://www.bnf.fr/en/agendaEN/workshops-tutorials-les-futurs-fantastiques-3rd-conference-about-arti...
 
Description Daniel van Strien, British Library Digital Scholarship Training Programme, workshop on computer vision for historical maps, 13 February 2020
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact A workshop held for British Library staff on using Computer Vision methods with heritage data including historic map collections.
Year(s) Of Engagement Activity 2020
 
Description Daniel van Strien, Kaspar Beelen, CREATE Digital History Workshop: Maps-as-Data: Analysing Historical Maps with Computer Vision, Feb 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Workshop on using Computer Vision methods with historical collections held at the CREATE centre at the University of Amsterdam.
Year(s) Of Engagement Activity 2020
URL https://www.create.humanities.uva.nl/events/digital-history-workshop-maps-as-data-analysing-historic...
 
Description Daniel van Strien, Katherine McDonough, Daniel Wilson presented at Victorian Data Conference, University of Virginia, November 15-16, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Three Living with Machines members presented in a session about "Living with Bias" at the Victorian Data conference, the first gathering of nineteenth-century studies scholars using digital methods in their work. Attended by about 100 researchers, our presentation both introduced Living with Machines to this largely US-based audience and generated several connections which have already resulted in visits to the Turing/BL in London in 2020 (including the faculty director of the University of Virginia Scholar's Lab, Alison Booth, who was a co-host of this conference).
Year(s) Of Engagement Activity 2019
URL http://data-caucus.herokuapp.com/conference-cfp
 
Description Data Study Group on smart monitoring for conservation areas 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Data Study Groups (DSG) are intensive five day 'collaborative hackathons' hosted at the Turing, which bring together organisations from industry, government, and the third sector, with talented multi-disciplinary researchers from academia. Kasra Hosseini and Mariona Coll Ardanuy were the principal investigators of a DSG with the World Wide Fund for Nature (WWF) on "Smart monitoring for conservation areas". The methods explored are closely related to methods directly applicable to Living with Machines datasets.
Year(s) Of Engagement Activity 2019
URL https://www.turing.ac.uk/research/publications/data-study-group-final-report-wwf
 
Description David Beavan and James Hetherington contributing to Royal Society 'Dynamics of data science skills' 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Contribution to report - see link.
Year(s) Of Engagement Activity 2019
URL https://royalsociety.org/topics-policy/projects/dynamics-of-data-science/
 
Description David Beavan invited 'floating expert' and Mia Ridge, Dr. Katherine McDonough, Dr. Kaspar Beelen and Dr. Kasra Hosseini (project collaborator) invited participants at Computational Archival Science Workshop: Exploring Data, Investigating Methodologies, The National Archives, 20-21 June 2019 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact About 100 people attended this event where Kaspar Beelen and Katie McDonough presented the keynote lecture on bias in digitized archival collections being used in the Living with Machines project. The international audience included GLAM professionals and students from the US, UK, and elsewhere in Europe, and the event fostered conversations about the role of GLAM institutions in collaborating with researchers to develop best practices for creating, preserving, and making accessible digitised and born digital collections.
Year(s) Of Engagement Activity 2020
URL https://blog.nationalarchives.gov.uk/computational-archival-science-cas-exploring-data-investigating...
 
Description David Beavan invited presentation at Software Development in Digital Humanities Labs and Projects, University of Sussex, 30 July 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description David Beavan invited talk at National Library of Scotland: Focused tech development delivering enhanced collections data
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact David Beavan invited talk given to National Library of Scotland (NLS) internal professional seminar series
Year(s) Of Engagement Activity 2020
 
Description David Beavan led, Mia Ridge, Barbara McGillivray participated in panel discussion 'Data Science & Digital Humanities: new collaborations, new opportunities and new complexities' at Digital Humanities 2019 conference, Utrecht, July 11, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This panel highlights the emerging collaborations and opportunities between the fields of Digital Humanities (DH), Data Science (DS) and Artificial Intelligence (AI). It charts the enthusiastic progress of the Alan Turing Institute, the UK national institute for data science and artificial intelligence, as it engages with cultural heritage institutions and academics from arts, humanities and social sciences disciplines. We discuss the exciting work and learnings from various new activities, across a number of high-profile institutions. As these initiatives push the intellectual and computational boundaries, the panel considers the gains, benefits, and complexities encountered. The panel latterly turns towards the future of such interdisciplinary working, considering how DS & DH collaborations can grow, with a view towards a manifesto. As Data Science grows globally, this panel session will stimulate new discussion and direction, to help ensure that the fields grow together, that arts & humanities remain a strong focus of DS & AI, and that DH methods and practices continue to benefit from new developments in DS which will enable future research avenues and questions.
Year(s) Of Engagement Activity 2019
URL https://dev.clariah.nl/files/dh2019/boa/0364.html
 
Description David Beavan presented at Turing Innovation Symposium, hosted by Accenture, Dublin, 3-4 April 2019. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview of Living with Machines for Turing Innovation Showcase in Dublin 2019.
Year(s) Of Engagement Activity 2019
 
Description David Beavan presented talk 'Potential Uses of a Registry of Digitised Works: By scholars' at Global Digitised Dataset Network, British Library, 10 June 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact Lessons from the project on uses of a registry of digitised works
Year(s) Of Engagement Activity 2019
URL https://gddnetwork.arts.gla.ac.uk/
 
Description Deep Learning approaches in GIScience session at the Royal Geographical Society Annual Conference: Maps and Machines presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation about computer vision for maps research at the annual Royal Geographical Society conference. Virtual audience of about 30 people.
Year(s) Of Engagement Activity 2021
URL https://sdesabbata.github.io/deep-learning-giscience/
 
Description Deep learning reading group 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post introducing an internal reading group on deep-learning methods being used by the project.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/deep-learning-reading-group/
 
Description Developing Data Study Group with TNA on (web) archives and social attitudes towards new technologies, initiated by Barbara McGillivray and David Beavan 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Data Study Groups are intensive five day 'collaborative hackathons' hosted at the Turing, which bring together organisations from industry, government, and the third sector, with talented multi-disciplinary researchers from academia. Beavan and McGillivray co-organised a DSG with the National Archives on "Discovering topics and trends in the UK Government Web Archive"
Year(s) Of Engagement Activity 2019
URL https://www.turing.ac.uk/events/data-study-group-december-2019
 
Description Did Machines Drive History? 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post introducing the first minimum research outcome of the language lab, in which we explored to what extent machines were being seen as agents able to drive change.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/did-machines-drive-history/
 
Description Digital Humanities and Research Software Engineering working together: some examples of a fruitful collaboration from the Living with Machines project 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Federico Nanni and Kasra Hosseini (from the Research Engineering group at the Alan Turing Institute) and Kaspar Beelen and Mariona Coll Ardanuy (postdocs in the Living with Machines project) shared their experience in working together in projects at the intersection of software engineering, computational linguistics and digital humanities, as part of the KQ Codes Technical Socials at University College London. About 20 participants attended.
Year(s) Of Engagement Activity 2021
URL https://www.ucl.ac.uk/research-it-services/programming-hub/kq-codes-technical-socials
 
Description Digital Humanities at Oxford Summer School Virtual Event 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Interactive workshop on "An introduction to natural language processing with Python", organised by Mariona Coll Ardanuy, Kaspar Beelen, and Federico Nanni. Participants learned how to use Python programming for powerful text processing in the Humanities, from cleaning texts to extracting meaning from them, as well as the basics of automated semantic analysis with machine learning. There were 60 attendees.
Year(s) Of Engagement Activity 2020
URL https://www.dhoxss.net/dhox2020-virtual-event-report
 
Description Digital Humanities at Oxford Summer School Virtual Event 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Kaspar Beelen, Federico Nanni, and Mariona Coll Ardanuy gave the talk "From Text to Tech: Text mining and the humanities, using language models to find living machines in nineteenth-century books" at the 2020 virtual edition of Digital Humanities at Oxford Summer School, with 270 attendees.
Year(s) Of Engagement Activity 2020
URL https://www.dhoxss.net/dhox2020-virtual-event-report
 
Description Digital Humanities at Oxford Summer School Virtual Event 2021 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Kaspar Beelen, Federico Nanni, and Mariona Coll Ardanuy gave the talk "Models of Language: Using algorithms to explore the past" at the 2021 virtual edition of Digital Humanities at Oxford Summer School. There were 450 participants.
Year(s) Of Engagement Activity 2021
URL https://digital.humanities.ox.ac.uk/digital-humanities-oxford-summer-school
 
Description Digital Humanities at Oxford Summer School Virtual Event 2021 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Interactive workshop on "Language models and their use in the digital humanities", by Mariona Coll Ardanuy, Kaspar Beelen, and Federico Nanni. This workshop offered a basic introduction to language models using Python. Participants learned how to use and interpret different language models and to train their own models. There were 16 participants.
Year(s) Of Engagement Activity 2021
URL https://digital.humanities.ox.ac.uk/digital-humanities-oxford-summer-school
 
Description Emma Griffin invited presentation: International symposium - 'Dartmouth and the World', Dartmouth University, 10-20 October 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Gave talk on "Life and Living Standards in Britain's Industrial Revolution"
Year(s) Of Engagement Activity 2019
 
Description Emma Griffin invited presentation: Oregon State University, Centre for the Humanities, 7 October 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Talk on "Home Economics: Food, Money, and Emotions in Victorian Britain"
Year(s) Of Engagement Activity 2019
 
Description Finding words in maps, part 2: seeing the results 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post about evaluating the 'Strabo' tool (software for transcribing text in digitised historical maps) on our map data through visualisation.
Year(s) Of Engagement Activity 2019
URL https://livingwithmachines.ac.uk/finding-words-in-maps-part-2-seeing-the-results/
 
Description Free Thinking: Archiving, curating and digging for data 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact BBC Radio 3 broadcast:

What stories are being uncovered by people working behind the scenes at museums and institutions? Lisa Mullen finds out talking to Tessa Jackson - Conservator;
David Beavan - Senior Research Software Engineer, Turing Institute and Matt Harle - Archivist and curator at the Barbican.

Barbara Hepworth: Art & Life runs at the Hepworth Wakefield from 21 May 2021 to 27 Feb 2022. The gallery also runs a Hepworth Research Network in partnership with the Department of History of Art at the University of York and the School of Art, Design and Architecture at the University of Huddersfield.
https://hepworthwakefield.org/our-story/hepworth-research-network/people/

Matthew Harle is an archivist working with the Barbican as it prepares for its 40th anniversary so is assembling an archive alongside the Guildhall School of Music and Drama
https://www.barbican.org.uk/our-story/our-archive/about-the-archive
https://matthewharle.com/Barbican-Archive

The Alan Turing Institute https://www.turing.ac.uk/ is the national institute for data science and artificial intelligence running a host of research projects into topics including AI, Public Policy and Living with Machines - a project that rethinks the impact of technology on the lives of ordinary people during the Industrial Revolution.
https://livingwithmachines.ac.uk You can hear more from historian Emma Griffin in this conversation about Understanding the Industrial Revolution https://www.bbc.co.uk/programmes/p081y7h4
Year(s) Of Engagement Activity 2021
URL https://www.bbc.co.uk/programmes/m000vydf
 
Description Giorgia Tolfo & Timothy Hobson online poster presentation at Data for History (June 2021): Modelling Time, Places, Agents (Berlin) entitled "Supporting an interdisciplinary research agenda through meta-modelling. The case of LwM" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A poster was presented for discussion with academic conference attendees on the subject of the "meta-modelling" approach taken to conceptual data modelling within the LwM project.
Year(s) Of Engagement Activity 2021
 
Description Hacking 23 years of government history: An example from The UK Government Web Archive 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Turing blog:

Web archives provide a key resource for the public. They allow us to access a wide range of data reflecting all areas of a society but, as they are large and meticulously maintained datasets, they can be daunting and difficult to navigate.

The Alan Turing Institute and The National Archives co-organised a Data Study Group challenge. Data Study Groups (DSGs) are events hosted by the Turing, which bring together some of the top talent from data science, artificial intelligence, and wider fields from across the world, to analyse real-world data science challenges.

The culmination of that work is now available to read via the published Data Study Group report 'Discovering topics and trends in the UK government web archive'
Year(s) Of Engagement Activity 2021
URL https://www.turing.ac.uk/blog/hacking-23-years-government-history-example-uk-government-web-archive
 
Description Historical Hypothesis Generation (BlogPost) 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Long blog post outlining an element of our interdisciplinary method.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/historical-hypothesis-generation-hypgen/
 
Description Hunting for Treasure: Living with Machines and the British Library Newspaper Collection. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation "Hunting for Treasure: Living with Machines and the British Library Newspaper Collection." at the Impresso Workshop in Lausanne (held online)
Year(s) Of Engagement Activity 2020
URL https://impresso.github.io/eldorado/online-program/
 
Description IIIF conference lightning talk: IIIF and machine learning inference: a love story? 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Lightning talk as part of the IIIF conference discussing the use of IIIF and computer vision to work with a Library of Congress collection of digitised newspapers.
Year(s) Of Engagement Activity 2021
URL https://iiif.io/event/2021/annual_conference/
 
Description Implications of AI for Libraries presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact A presentation as part of a post-graduate library science talk on the implications of AI, drawing examples from the Living with Machines project.
Year(s) Of Engagement Activity 2020
 
Description Information+ Conference talk: Olivia Vane, Kasra Hosseini, Katherine McDonough and Daniel CS Wilson - 'Maps in Time: Visualising the historical Ordnance Survey' 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A division is often made between maps and timelines. This presentation from the Living with Machines project explores combining the two, visualising a dataset of 130,000 maps from the early Ordnance Survey (OS), Britain's national mapping agency. It was the OS who, from the early 19th century, created the first comprehensive, detailed and accurate picture of Great Britain. We show how animated data graphics can bring the story of the maps to life for a popular audience. We also visualise the data by space and time to support analysis in research.
Year(s) Of Engagement Activity 2021
URL https://vimeo.com/598429189
 
Description Intro to D3 session for Alan Turing Institute REG 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Teaching an 'Introduction to D3.js' for the Alan Turing Institute Research Engineering Group lunchtime tech talks. 2hr session: presentation and going through tutorials.
Year(s) Of Engagement Activity 2020
 
Description Introduction to Computer Vision for Digital Heritage using Living with Machines research 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Presentation as a part of the day-long conference organized by Polly Hudson to review Colouring London and related research for applications with Historic England and adjacent agencies.
Year(s) Of Engagement Activity 2021
URL https://colouringlondon.org/
 
Description Introduction to Jupyter Notebooks: the weird and the wonderful 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact An online workshop focused on potential uses of Jupyter Notebooks in GLAM (Galleries, Libraries, Archives and Museums) settings.
Year(s) Of Engagement Activity 2021
URL https://github.com/Living-with-machines/Jupyter-Notebooks-The-Weird-and-Wonderful
 
Description Introduction to Python, with Mariona Coll Ardanuy, July 19th 2019, organised by Mariona Coll Ardanuy for Turing Community 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact 4-hour introductory course to programming for the Humanities, with a focus on text processing and data wrangling (e.g. opening and working with documents and file paths). The feedback was very positive. Participants got acquainted with the basics of Python programming, which they have been able to apply to the project on multiple occasions.
Year(s) Of Engagement Activity 2019
 
Description Invited talk on Computer Vision research in LwM for the Association of Geographic Information-Scotland. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact About 150 people attended a talk about Living with Machines research with historical maps.
Year(s) Of Engagement Activity 2021
 
Description Invited talk, Princeton University, 'Crowdsourcing and the Humanities' 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Invited talk and panel discussion for an event with the Center for Research Data and Digital Scholarship at University of Pennsylvania Libraries, The Center for Digital Humanities at Princeton University Library, the Princeton Geniza Lab, and the Zooniverse, attended by c. 40 people. The panel and event sparked extended discussion on social media.
Year(s) Of Engagement Activity 2021
URL https://genizalab.princeton.edu/crowdsourcing-and-the-humanities
 
Description Invited talk: Crowdsourcing in cultural heritage lecture for Institut für Kunstgeschichte, LMU München 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact An invited talk for a German seminar group.
Year(s) Of Engagement Activity 2021
 
Description Invited talk: User Experience (UX) for Citizen Science , iDigBio event 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I was invited to speak at the event 'Biodiversity Digitization: Celebrating a decade of progress' in the session 'Innovations: Strategy & Coordination'. My talk outlined the importance of user experience design (UX) for increasing diverse participation in citizen science projects.
Year(s) Of Engagement Activity 2021
URL https://www.idigbio.org/wiki/index.php/Biodiversity_Digitization:_Celebrating_a_decade_of_progress
 
Description Jon Lawrence, Inter-Disciplinary Research Programme Assessor for British Academy - 'The Humanities and Social Sciences Tackling the UK's International Challenges' (2019) 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Assessing projects under the heading "The Humanities and Social Sciences Tackling the UK's International Challenges"
Year(s) Of Engagement Activity 2019
URL https://www.thebritishacademy.ac.uk/programmes/tackling-uk-international-challenges
 
Description K. Beelen, K. McDonough, "Maps and Machines: using computer vision to analyze the geography of industrial change (1790-1920)", University of Aberdeen DH Seminar, 26 Oct 2021 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Presentation of maps research to DH community at the University of Aberdeen.
Year(s) Of Engagement Activity 2021
 
Description K. Beelen, M. Coll Ardanuy and F. Nanni: "Breaking (the?) news in the nineteenth century", Knowledge, Information and Data Science (KIDS) group, University College London (UCL), London 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact We presented the results of a series of collaborations at the intersection of digital history, computational linguistics and software engineering focused on the use of our large digital collection of 19th Century newspapers.
Year(s) Of Engagement Activity 2022
 
Description K. Beelen, M. Coll Ardanuy and F. Nanni: "Living with Machines: Analysing Digital Heritage at Scale", Digital Humanities Lab Exeter, University of Exeter 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact We presented the results of a series of collaborations at the intersection of digital history, computational linguistics and software engineering focused on the use of our large digital collection of 19th Century newspapers.
Year(s) Of Engagement Activity 2022
URL https://www.exeter.ac.uk/news/events/details/index.php?event=11894
 
Description K. McDonough, "DH Careers: Beyond the Professoriate," CESTA, Stanford University, 15 Feb. 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact 30 PhD students at Stanford University attended this workshop to discuss career opportunities in the digital humanities.
Year(s) Of Engagement Activity 2022
URL https://cesta.stanford.edu/events/dh-careers-beyond-professoriate
 
Description K. McDonough, "Maps as [Open] [Humanities] Data: From Access to Analysis," Reimagining Industry/Academic/Cultural Heritage Partnerships in AI Workshop, AEOLIAN Network (Artificial Intelligence for Cultural Organisations), [virtual] 25 Oct. 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation to international audience about the maps research in Living with Machines, in particular the issues around ethical use of heritage resources in digital research.
Year(s) Of Engagement Activity 2021
URL https://www.aeolian-network.net/events/workshop-2/
 
Description K. McDonough, K. Hosseini, "Maps as Data" for Turing Catch Up Monthly Meeting, Jan 24 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Lightning talk about Maps research within Living with Machines during the monthly Turing Catch Up. Resulted in several inquiries about new applications, further research with MapReader.
Year(s) Of Engagement Activity 2022
 
Description Kaspar Beelen "Surveying the Newspaper Landscape" (CREATE Salon, February) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation at the University of Amsterdam attended by ca. 20 people. It was part of the "Salon" series organized by CREATE Amsterdam (Julia Noordegraaf).
Year(s) Of Engagement Activity 2020
URL https://www.create.humanities.uva.nl/
 
Description Kaspar Beelen Presentation for the British Library News Collection Group 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Presentation on the digitization of the Newspaper Press Directories and how this feeds into understanding the shape and contours of digital newspaper collections.
Year(s) Of Engagement Activity 2020
 
Description Kaspar Beelen and Katherine McDonough Keynote presentation at the Computational Archival Science symposium "Surveying the Land" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Keynote presentation by Kaspar Beelen and Katherine McDonough at the Computational Archival Science Symposium, organized at the Alan Turing Institute (January 2020).
Year(s) Of Engagement Activity 2020
URL https://www.turing.ac.uk/events/computational-archival-science-cas-symposium
 
Description Kaspar Beelen, Invited talk "Stereotypes in Newspaper data" at the Dutch National Library Research Week, September 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact Presentation at the Dutch Royal Library (KB) to report on the progress of my Research in Residence programme. It was part of the KB "Research Week" and was the most popular in terms of people signing up.
Year(s) Of Engagement Activity 2019
 
Description Kaspar Beelen, Panel discussion on Coding Literacy in the Digital Humanities, at Digital Humanities Benelux, September 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Participation in a round table on the topic of "Coding Literacy in the Humanities" (organized by Marijn Koolen, Liliana Melgar and Mari Wigham). The round table included a presentation with different experts (Joris van Zundert, Elli Bleeker, Sally Chambers) and discussion with an audience of Digital Humanities experts.
Year(s) Of Engagement Activity 2019
URL http://2019.dhbenelux.org/wp-content/uploads/sites/13/2020/01/DH_Benelux_2019_paper_25.pdf
 
Description Kaspar Beelen, Presentation on "Bias in the British Newspaper Archive" at Digital Humanities Benelux, September 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact 15-minute paper presentation on the work that emerged from the Sources Lab, focussed on understanding the newspaper landscape. Attended by ca. 25 people from various backgrounds (DH researchers, librarians).
Year(s) Of Engagement Activity 2019
URL http://2019.dhbenelux.org/wp-content/uploads/sites/13/2019/08/DH_Benelux_2019_paper_33.pdf
 
Description Kaspar Beelen, Presentation on "The Agency of Machines" at Digital Humanities Benelux, September 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Presentation reporting on "The Agency of Machines" at the poster session of Digital Humanities Benelux, 2019. It involved discussion with many interested conference attendees.
Year(s) Of Engagement Activity 2019
URL http://2019.dhbenelux.org/program/
 
Description Kaspar Beelen, Seminar on History and Text, Antwerp University, November 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Undergraduate students
Results and Impact Presentation on the use of Text Mining for History. Part of the course "History and Language" (BA2) organised by Marnix Beyen (University of Antwerp).
Year(s) Of Engagement Activity 2019
 
Description Katherine McDonough and Jon Lawrence, "An introduction to Living with Machines," University of Exeter DH Seminar, 23 October 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other audiences
Results and Impact Presentation to about 40 people at the DH Seminar at Exeter was a great opportunity to make contact with the expert community there and introduce them to our ongoing work.
Year(s) Of Engagement Activity 2019
URL http://www.exeter.ac.uk/news/events/details/index.php?event=9637
 
Description Katherine McDonough organized meeting with US experts in historical map processing using computer vision (29/8/2019 and 1/11/2019) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Conversation to plan for future collaboration with researchers working at the cutting edge of computer vision for historical maps in the United States.
Year(s) Of Engagement Activity 2019
 
Description Katherine McDonough, "Living with Machines," invited presentation at Spatial Relationships in Text as Data, The Alan Turing Institute, October 28, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Invited talk to review applications of research on qualitative spatial relations in the Living with Machines project. Question session offered an opportunity to learn about related research in the UK and to share our ongoing work with leaders in the field.
Year(s) Of Engagement Activity 2019
URL https://www.eventbrite.co.uk/e/spatial-relationships-in-text-as-data-tickets-76259685773
 
Description Katherine McDonough, "Living with Machines," presentation at DH Seminar, Center for Spatial and Textual Analysis, Stanford University, December 2 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact About 60 people attended a presentation at Stanford University about Living with Machines. This conversation has created substantive links to the DH community at Stanford and there is continued interest in collaborating with us in the future.
Year(s) Of Engagement Activity 2019
URL https://cesta.stanford.edu/events/cesta-seminar-dr-katie-mcdonough
 
Description Katherine McDonough, Fantastic Futures, invited presentation and workshop on computer vision for historical maps, 4-5 December 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Presented Living with Machines research on computer vision with maps during a roundtable on applications of AI in GLAM institutions, generating conversation with an international audience about working with visual heritage materials at scale. The workshop offered GLAM staff, researchers, and policy leaders an opportunity for hands-on experience in computer vision, which has translated into invitations for collaboration and additional teaching opportunities.
Year(s) Of Engagement Activity 2019
URL https://fantasticfutures.stanford.edu/
 
Description Katie McDonough, Olivia Vane, and Daniel Van Strien gave a '21st Century Talk' for British Library staff: 'Maps and Machines: Using Computer Vision to Analyze the Geography of Industrialization (1780-1920)', 14 Jan 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Delivered a talk about using computer vision techniques to analyse digitised historical maps at scale.
Year(s) Of Engagement Activity 2020
 
Description LWM listed at Genealogy Stories "10 Websites for the History of Ordinary People" 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Article in Medium listing good websites for the public to find out more about the history of ordinary people; it included Living with Machines as a recommended source.
Year(s) Of Engagement Activity 2021
URL https://genealogystoriesuk.medium.com/10-websites-for-the-history-of-ordinary-people-9ecc8b1b4832
 
Description Lab Talk for Workshop on Visualization for the Digital Humanities 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Olivia Vane gave a lightning talk at the online 5th Workshop on Visualization for the Digital Humanities about the British Library Digital Scholarship department.
Year(s) Of Engagement Activity 2020
URL http://vis4dh.org/
 
Description Library Carpentry session 1 (workshop) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Workshop 1 of a series of Library Carpentry workshops (https://librarycarpentry.org/) delivered online to British Library Staff
Year(s) Of Engagement Activity 2020
 
Description Library Carpentry session 2 (workshop) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Workshop 2 of a series of Library Carpentry workshops (https://librarycarpentry.org/) delivered online to British Library Staff
Year(s) Of Engagement Activity 2020
 
Description Library Carpentry session 3 (workshop) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Workshop 3 of a series of Library Carpentry workshops (https://librarycarpentry.org/) delivered online to British Library Staff
Year(s) Of Engagement Activity 2020
 
Description Library Carpentry session 4 (workshop) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Workshop 4 of a series of Library Carpentry workshops (https://librarycarpentry.org/) delivered online to British Library Staff
Year(s) Of Engagement Activity 2020
 
Description Linking Geo-Data through Test and Play 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Tutorial on DeezyMatch, troubleshooting session, and final roundtable to discuss the tools useful in linking geospatial data from historical sources.
Year(s) Of Engagement Activity 2020
URL https://github.com/LinkedPasts/LaNC-workshop
 
Description Living with Machines OCR hack 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post outlining an internal 'hack' event focused on OCR.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/living-with-machines-ocr-hack/
 
Description Maps as Data: A Humanistic Approach to Computer Vision for Large Map Collections 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation for the Unlocking Historical Maps of Southeast Asia Webinar Series, organized by Jane Jacobs at Yale-NUS in Singapore. The virtual workshop session was attended by 55 students, scholars, and librarians who are developing projects that use computational methods to study digitised map collections.
Year(s) Of Engagement Activity 2020
URL https://historicmapssea.commons.yale-nus.edu.sg/unlocking/
 
Description Mariona Coll-Ardanuy, Presentation at CogSci seminar at QMUL (13/06/2019) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other audiences
Results and Impact Talk at the Cognitive Science group at Queen Mary University of London, presenting preliminary research on the language lab work for Living with Machines. There were very relevant comments, and interesting questions as well. A subsequent talk at the Cognitive Science seminar was planned, which will take place on 25 May 2020.
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge and Andre Piza, invited participants at AI and Storytelling workshop, King's Digital Lab, King's College London, Apr 1st 2019. 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Researchers and industry reflected on ways of collaborating in the field, with particular attention to the challenges around engagement of Research Software Engineers, needed skills and project frameworks. Consolidated the relationship between KDL and Living with Machines, leading to a second meeting at the Turing with the KDL Director and 2 of their researchers with a view to future collaboration.
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge and Olivia Vane presented at KQ Codes Technical Socials at University College London: 'Research software engineering at one of the world's largest libraries', 20 February 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact The Knowledge Quarter (KQ) Codes Technical Socials at UCL are informal events for anyone with an interest in the computational methods and technology behind research and innovation. They are an opportunity to get to know fellow practitioners, and to discuss and learn about useful tools and techniques which may help with your work.

We gave a presentation on research software engineering at the British Library, including a discussion of RSE roles on Living with Machines.
Year(s) Of Engagement Activity 2020
URL https://www.ucl.ac.uk/research-it-services/programming-hub/kq-codes-technical-socials
 
Description Mia Ridge initiated a meetup for scholars and institutions working with digitised newspapers for humanities research at DH2019, Utrecht 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Group established to discuss the challenges and opportunities for scholars and institutions to collaborate using digitised newspaper collections
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge led panel discussion 'The Past, Present and Future of Digital Scholarship with Newspaper Collections' at Digital Humanities 2019 conference, Utrecht, July 10, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
URL http://www.openobjects.org.uk/2019/07/the-past-present-and-future-of-digital-scholarship-with-newspa...
 
Description Mia Ridge presented 'Living with "Living with Machines": navigating the digital shift at scale' paper accepted for DCDC, Discovering Collections, Discovering Communities, organised by The National Archives and Research Libraries UK (November 2019) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Talk to cultural heritage audience at TNA about the project.
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge presented on the project Living with Machines to Alberta Comer, Dean and University Librarian, University of Utah (May 2019) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Please add - compulsory
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge presented on the project at the Library of Congress's Digital Strategy Roundtable, Washington DC (June 2019) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact Please add - compulsory
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, 'Machine Learning and Digital Humanities' panel, University of Newcastle, Newcastle, September 5, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact As machine learning becomes more common across a wide range of digital solutions, and increasingly factors in our daily lives, it is also being used more frequently in humanities research projects. The possibilities of machine learning need to be understood by humanities researchers, and the complexities of the problems investigated in the humanities need to be understood by those working with machine learning technologies. The humanities can offer a wealth of historical data that presents new challenges to machine learning methodologies: historical records, pictorial representations, literary (or other) text. Recent Digital Humanities projects already employ some machine learning technology, such as with the development of Handwritten Text Recognition (HTR), but the diversification of the data investigated with machine learning approaches has the potential to lead the technology in new and unexpected ways with real-world applications. Panel members include: Beatrice Alex (University of Edinburgh), Noura Al-Moubayed (Durham University), Mia Ridge (British Library), and Melissa Terras (University of Edinburgh).
Year(s) Of Engagement Activity 2019
URL https://n8cir.org.uk/events/machine-learning-and-digital-humanities/
 
Description Mia Ridge, invited presentation, British Library Data Projects workshop, London, August 19, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, Consortium of European Research Libraries (CERL) Annual Seminar 2019, Göttingen, October 9, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Policymakers/politicians
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, KCL / British Library Research Collaboration workshop, King's College London, September 27, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, Museums + AI Network workshop, Pratt Institute, New York, September 16, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
URL https://www.openobjects.org.uk/2019/09/museums-ai-new-york-workshop-notes/
 
Description Mia Ridge, invited presentation, Princeton University Library, Princeton, September 13, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, Research Libraries UK International Symposium on Digital Scholarship, London, October 14, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This symposium explored the nature and extent of digital scholarship occurring within research libraries across the international research library community. It brought together representatives from international research library associations, funders, the academic community, and global-library collectives to discuss areas of potential cross-sector and interdisciplinary collaboration, and the routes and networks through which this might be achieved. Mia Ridge presented on "Building capacity for digital scholarship at a research library: Living with Machines, and the impact of data science"
Year(s) Of Engagement Activity 2019
URL https://www.rluk.ac.uk/digital-scholarship-and-the-role-of-the-research-library-symposium-slides/
 
Description Mia Ridge, invited presentation, Wellcome Library, London, July 4, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, presentation, Library of Congress Machine Learning Summit, Washington DC, September 20, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Ridge touched on three main kinds of challenges: scale, operational and interdisciplinary, and copyright. A larger scale requires new workflows and quickly grows expensive, operationalizing raises the question of producing public-facing infrastructure, and copyright involves negotiating complex rights issues.
Year(s) Of Engagement Activity 2019
URL https://labs.loc.gov/static/labs/meta/ML-Event-Summary-Final-2020-02-13.pdf?loclr=blogsig
 
Description Netherlands Film Festival 2020: Generous Interfaces panel 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Olivia Vane took part in a panel at the Netherlands Film Festival 2020 (run online because of the Covid pandemic) on Generous Interfaces: "In the Generous Interfaces panel we investigate alternative ways to search audiovisual collections, using De Open Beelden Browser ('The Open Images Browser'). How can you enjoy exploring archives even if you're not looking for anything in particular?".

Olivia gave a presentation and then participated in a panel discussion.
Year(s) Of Engagement Activity 2020
URL https://www.filmfestival.nl/en/collection/nff-conferentie-generous-interfaces/
 
Description Olivia Vane, Katherine McDonough, Daniel van Strien, 21st Century Curator Talk (British Library staff talks), Maps and Machines: Using Computer Vision to Analyse the Geography of Industrialization (1780-1920), January 13, 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Talk on Using Computer Vision to Analyse the Geography of Industrialization (1780-1920)
Year(s) Of Engagement Activity 2019
 
Description Panel discussion: Expanding and Enriching Metadata through Engagement with Communities 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This panel discusses how cultural institutions are engaging various communities to co-create academic research and/or object metadata in order to increase representation and access to collections; highlighting how this is done in different ways to engage specific audiences and goals, i.e. graduate student assistantships, museum interactive experiences, crowdsourcing, and professional action groups.
Year(s) Of Engagement Activity 2021
URL https://mcn2021virtual.sched.com/event/lwrc/expanding-and-enriching-metadata-through-engagement-with...
 
Description Paper submission: Hunting for Treasure: Living with Machines and the British Library Newspaper Collection 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Abstract: This chapter discusses the open access digitisation programme undertaken by Living with Machines, exploring the range of constraints that inform digitisation strategies and selection priorities. Because the landscape of digitised newspaper collections is so complex, and research and digitisation processes operate on different timelines, we have focused on opportunities to make digitisation choices both transparent and pragmatic. Working towards solutions that reflect collaborations between library staff and scholars, we introduce: a) Press Picker, our custom visualisation tool designed to support decision making about digitisation; and b) the Environmental Scan, a process of automatic metadata generation from the Newspaper Press Directories, a contemporaneous record of British newspapers.
Year(s) Of Engagement Activity 2020
 
Description Podcast interview: Crowdsourcing with Dr Mia Ridge, MadeTech Making Tech Better podcast 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact What is crowdsourcing, and how is it used to improve the British Library's online cultural heritage collections? Clare Sudbery talks to crowdsourcing expert Dr Mia Ridge about the power of volunteer digital engagement.
Year(s) Of Engagement Activity 2021
URL https://www.madetech.com/resources/podcasts/episode-14-mia-ridge-2/
 
Description Poster submission: Data for History in Berlin 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster submission for conference Data for History in Berlin (May 2020) approved.
Due to COVID-19, the conference has been postponed until May 2021.
Year(s) Of Engagement Activity 2020
URL https://d4h2020.sciencesconf.org/
 
Description Presentation 'Historic Census Data and Living with Machines' to Free UK Genealogy's 2021 conference on Open, Global Genealogy (22nd May 2021) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Presentation on 'Historic Census Data and Living with Machines' delivered by Josh Rhodes and Guy Solomon to Free UK Genealogy's 2021 conference on Open, Global Genealogy (22nd May 2021). The presentation gave genealogical professionals, family historians, and other members of the public an insight into how Living with Machines is using historic census data. In particular, we focused on our use of open census data, which is in line with the Free UK Genealogy's mission to provide free, online access to historic British census data. The presentation was delivered to an audience of c. 100 on Zoom, and has since received > 200 views on YouTube. Presenting at this conference enabled the Living with Machines project to establish a closer relationship with Free UK Genealogy, and to begin conversations about sharing data. The presentation also engaged members of the public, who expressed interest in our use of census data, and changed people's minds about what was possible to achieve at scale with historic census data.
Year(s) Of Engagement Activity 2021
URL https://youtu.be/EY7mwn_sHHU?t=716
 
Description Presentation at CogSci seminar at QMUL 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Talk at the Cognitive Science group at Queen Mary University of London, presenting research on "Animate Machines: A study on atypical animacy detection".
Year(s) Of Engagement Activity 2020
URL http://imc.eecs.qmul.ac.uk/wiki/index.php/Abstract_Mariona_Coll_Ardanuy_25_March_2020
 
Description Presentation by Daniel Wilson and Ruth Ahnert at Text Mining Parliamentary Data Seminar, University of Umea, Sweden. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact "Tracing the language of machines across genres: books, journals and newspapers", an academic presentation by Daniel Wilson and Ruth Ahnert at a high-profile international seminar featuring luminaries of the field such as Mark Algee-Hewitt, with Prof. Jo Guldi as chair and respondent. The method generated much interest.
Year(s) Of Engagement Activity 2021
URL https://www.umu.se/en/events/comparing-parliaments-novels-and-newspapers_10768814/
 
Description Presentation for the C2DH group in Luxembourg 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Presentation for the C2DH group in Luxembourg. The presentation was part of the "Hands-on History" lectures.
Year(s) Of Engagement Activity 2021
URL https://www.c2dh.uni.lu/events/living-machines-digital-perspectives-industrial-revolution
 
Description Presentation for the History Department at the University of Antwerp 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Presentation on digital methods for history for the History Department at the University of Antwerp.
Year(s) Of Engagement Activity 2021
 
Description Presentation for the Parliamentary Data Seminar (14/10/2021) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Presentation on the Targeted Sense Disambiguation during the Parliamentary Data Seminar on the topic "What's really going on".
Year(s) Of Engagement Activity 2021
URL https://www.umu.se/en/events/text-mining-parliamentary-data-seminar-what-is-really-going-on-_1084277...
 
Description R. Ahnert, K. Beelen, K. McDonough, D.C.S. Wilson, "Working with Machines," IHSS Seminar at Queen Mary University of London 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Presentation about key findings in Living with Machines to the Institute for the Humanities and Social Sciences at Queen Mary University of London.
Year(s) Of Engagement Activity 2021
URL https://www.qmul.ac.uk/ihss/whats-on/replay-/working-with-machines.html
 
Description Research update presentations for British Library staff 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Presentations to British Library staff to share updates on research outcomes in progress. This included:

An overview of digitisation (newspapers, Ordnance Survey maps, etc)
Crowdsourcing, Mia Ridge
Extracting metadata from press directories, Kaspar Beelen and Daniel Wilson
Linking places across sources, Mariona Coll Ardanuy and Katherine McDonough
Visualising map collections, Olivia Vane

Organised by Maja Maricevic and Daniel van Strien
Year(s) Of Engagement Activity 2018,2020
 
Description Rethink Research, Illuminate History webinar, Leeds Digital Festival 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A talk for the Leeds Digital Festival that explained how the project is using data science and digital history methods to analyse millions of historical documents and understand the impact of mechanisation in the 19th century, and why the British Library is interested in working with data scientists to apply computational methods to historical collections.
Year(s) Of Engagement Activity 2020
URL https://www.youtube.com/watch?v=Po-fw5uWrWM
 
Description Running a remote workshop via a Zoom video call: some quick lessons learnt 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post sharing lessons learnt about running interactive workshops online in the early days of the COVID pandemic.
Year(s) Of Engagement Activity 2020
URL http://livingwithmachines.ac.uk/running-a-remote-workshop-via-a-zoom-video-call-some-quick-lessons-l...
 
Description Ruth Ahnert and David Beavan contribution to the review process of British Academy Digital Research in Humanities scheme 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact A workshop here at the British Academy to learn about the progress of seven grants awarded through the Digital Research in Humanities scheme last year. The programme aims to extend support for researchers engaging with Digital Research in the Humanities by offering grants to carry out novel research through the application of new methods and tools to existing digital resources. The use and re-use of existing resources such as digital collections and datasets will demonstrate their capacity to generate new knowledge. The seven award-holders were invited to provide a brief presentation on the aims of their research, what is new about their approach to the topic, problems and challenges arising so far, scholarly benefits identified, the potential for scalability, and plans for the future. Ahnert and Beavan were asked to contribute to this review process which will discuss what commonalities, learning and good practice are emerging from the projects to date. The workshop was also attended by representatives from Jisc who partnered with the British Academy on the scheme and Fellows of the British Academy.
Year(s) Of Engagement Activity 2019
 
Description Ruth Ahnert presented on the project to the Newseye project, at their project meeting hosted at the British Library (15 November 2018) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact Report on Living with Machines to the Newseye project (which works on news, and so has important connections with our work)
Year(s) Of Engagement Activity 2018
 
Description Ruth Ahnert, 'Partial Histories from Partial Archives: Surveying the Land', Think-Play-Hack Summit 2019: Mythologies and World Views, SMU at Taos Campus, New Mexico, 3 July 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Think-Play-Hacks bring together researchers from across the United States and world to contemplate the frontiers of interdisciplinary data science. Imagine a group of humanists, social scientists, data scientists, engineers, literary scholars, and students in the mountains of Taos, New Mexico, moving through big ideas step-by-step, over a series of conceptual explorations, micro-tutorials, and mini-hackathons.

"Think-Play-Hack" Summits incubate interdisciplinary exchange through one day of presentations where we "think" together on critical perspectives from the humanities and social sciences. Next, participants "play" with the fit between critical questions, data, and algorithms through a series of lightning presentations and micro-seminars. Finally, participants "hack" on the concepts and questions they have heard, generating collaborative white-papers or code-driven approaches to the question. Student teams compete for titles such as "best use of data" or "best visualization."

Think-Play-Hack was founded in 2018 by Mark Algee-Hewitt (Stanford University), Simon DeDeo (Carnegie Mellon University), James Evans (University of Chicago), Jo Guldi (Southern Methodist University), and Andrew Piper (McGill University).
Year(s) Of Engagement Activity 2019
 
Description Ruth Ahnert, Roundtable contribution on 'Critical Data Science: Mapping the Field' at workshop: "Data Science and Digital Cultural Heritage: facilitating new connections between the disciplines and professions that can transform the Global Data Context", 27th June 2019 at UCL. 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact This workshop was organized by Dr Julianne Nyhan and Dr Tessa Hauswedell and was made possible by the UCL Grand Challenges fund and the UCL Centre for Critical Heritage. In this workshop, we sought to investigate how large-scale digital cultural heritage archives are offering new ground to explore the potentials and hazards of data science. We reflected on what might be gained from the building of a 'critical data science' that can better account for the social and cultural contexts that shape digital cultural heritage production and its data-led research. In doing so, we opened a multidisciplinary and trans-professional dialogue on the future of data science and digital cultural heritage.
Year(s) Of Engagement Activity 2019
 
Description Ruth Ahnert, paper presented: 'Living with Machines', Institute for Applied Data Science Colloquium, QMUL, 20 March 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Project overview talk.
Year(s) Of Engagement Activity 2019
 
Description Semantic Corpus Exploration 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A half-day workshop on Semantic Corpus Exploration organized at the Digital Humanities Conference, Utrecht (2019). Participants learned how to (critically) use DBPedia Spotlight and WideNet for exploring and traversing historical corpora. The workshop was organized by Alex Olieman (University of Amsterdam), Jaap Kamps (University of Amsterdam) and Kaspar Beelen.
Year(s) Of Engagement Activity 2019
URL https://aolieman.github.io/semantic-corpus-exploration/
 
Description Sharing the benefits: free to view newspapers on the British Newspaper Archive 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post announcing that newspaper titles digitised by the project are available for free on the British Newspaper Archive, the result of a long collaboration with FindMyPast and curators at the British Library.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/sharing-the-benefits-free-to-view-newspapers-on-the-british-newspap...
 
Description Spatial History as Digital History Invited Seminar Presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact About 40 researchers from the University of Leipzig, Collaborative Research Centre 1199: Processes of Spatialization under the Global Condition attended McDonough's presentation on doing spatial history research using digital methods.
Year(s) Of Engagement Activity 2020
URL https://research.uni-leipzig.de/~sfb1199/
 
Description Spatial Humanities 2021: Maps and Machines presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation to the Spatial Humanities 2021 conference, with about 30 people in the virtual audience.
Year(s) Of Engagement Activity 2021
URL https://dhlab.fcsh.unl.pt/2020/05/21/spatial-humanities-2020-2022/#:~:text=New%20Dates%3A%2015%2D17t...
 
Description Talk: '"Stop, collaborate and listen": Lessons on interdisciplinary collaboration from the Living with Machines Project', The Digital Humanities Research Hub at the School of Advanced Study (virtual) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Invited talk followed by discussion
Year(s) Of Engagement Activity 2021
URL https://www.sas.ac.uk/videos-and-podcasts/digital/Lessons-on-interdisciplinary-collaboration-living-...
 
Description Talk on Digital History in Living with Machines for History Day 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Presented a talk and took part in a panel on digital history for over 270 attendees as part of the Institute for Historical Research's History Day 2020. Questions to the panel were wide-ranging and included copyright, responses to Black Lives Matter, digital participation and access to primary sources. History Day is a 'day of online interactive events for students, researchers & history enthusiasts to explore library, museum, archive and history collections across the UK & beyond.'
Year(s) Of Engagement Activity 2020
URL https://www.history.ac.uk/events/history-day-2020
 
Description Talk: 'Big Data? Orders of Magnitude in Digital Humanities Projects', CUDAN Open Lab Seminar series, Tallinn University, Estonia (virtual) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact 40-minute invited paper and discussion
Year(s) Of Engagement Activity 2021
URL https://cudan.tlu.ee/events/2021-03-01-ruth-ahnert-lecture/
 
Description Talk: 'Collaborating virtually/Virtually collaborating', Gerald Aylmer Seminar 2021, The National Archives (virtual) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Third sector organisations
Results and Impact A pre-recorded talk, followed by live panel discussion
Year(s) Of Engagement Activity 2021
URL https://www.youtube.com/watch?v=J7Oj8L3tOTU
 
Description Talk: 'Tracing the language of machines across genres: books, journals and newspapers', The Text Mining Parliamentary Data Seminar, Umeå universitet (virtual) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Talk delivered with Daniel C.S. Wilson, 'Tracing the language of machines across genres: books, journals and newspapers', The Text Mining Parliamentary Data Seminar, Umeå universitet (virtual)
Year(s) Of Engagement Activity 2021
 
Description Talk: 'RSE & Historical Research Working Together: Living with Machines', DH + RSE Summer School, The Alan Turing Institute (virtual) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Talk delivered with David Beavan, followed by panel discussion
Year(s) Of Engagement Activity 2021
URL https://github.com/alan-turing-institute/DH-RSE-Summer-School/tree/main/Day%201
 
Description Talk: Collaboration on AI at scale: lessons from Living with Machines, International Council on Archives (ICA) Virtual Conference 'Empowering Knowledge Societies' 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact My talk was in the session 'Artificial Intelligence Connects Digital Humanities with Archives Knowledge' at the International Council on Archives (ICA) Virtual Conference 'Empowering Knowledge Societies'. My abstract was:

A partnership between the British Library and Alan Turing Institute with data scientists, curators, research software engineers, computational linguists, digital humanities scholars and historians from those institutions and universities including Exeter, University of East Anglia, Cambridge and Queen Mary University of London, the Living with Machines project is developing data science and AI methods to ask historical questions using digitised collections at scale. Our sources include millions of pages of historical newspapers, novels, maps, census records, directories and other sources. We hope that the research methodologies and tools developed through the project will be adapted and used by cultural heritage professionals and researchers to access and understand digitised historic collections in the future.

This talk will outline why the British Library sought to collaborate with the UK's data science and AI institute. It will share some early methodological results from the project, and reflect on lessons learnt from working with an interdisciplinary team to apply data science methods for research questions in areas as varied as computational linguistics, human computing/crowdsourcing, historical analyses of space and time, data science and software engineering. It will also consider the implications of computational metadata generation or enhancement for existing cataloguing and discovery systems within the Library, and our conscious efforts to share work in progress with staff across the British Library.
Year(s) Of Engagement Activity 2021
URL https://ica.delegateconnect.co/
 
Description Talk: Integrating crowdsourcing into Living with Machines, Crowdsourcing and the Humanities conference, Princeton 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A talk for a conference organised by the Center for Research Data and Digital Scholarship at University of Pennsylvania Libraries, The Center for Digital Humanities at Princeton University Library, the Princeton Geniza Lab, and the Zooniverse. Mia's presentation was on the panel 'Platforms for People-Powered Research'
Year(s) Of Engagement Activity 2021
URL https://www.kaltura.com/index.php/extwidget/preview/partner_id/1449362/uiconf_id/14292362/entry_id/1...
 
Description The Association for Geographic Information-Scotland: Maps and Machines: Using Computer Vision to Analyse the Geography of Industrialisation (1780-1920) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Invited talk at annual Scottish conference on geographic information.
Year(s) Of Engagement Activity 2021
URL https://www.agi.org.uk/component/civicrm/?task=civicrm/event/info&Itemid=238&reset=1&id=1166
 
Description The impact of OCR on downstream Natural Language Processing tasks (tl;dr) 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post summarising a research paper.
Year(s) Of Engagement Activity 2020
URL http://livingwithmachines.ac.uk/the-impact-of-ocr-on-downstream-natural-language-processing-tasks-tl...
 
Description Turing Data Science and Digital Humanities special interest group: uniting all 13 Turing universities plus the GLAM sector. Co-I Barbara McGillivray is leading on a Manifesto for community building; organised by Co-Is David Beavan and Barbara McGillivray 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Year(s) Of Engagement Activity 2019
 
Description Unlocking Historical Maps of Southeast Asia: Collections, Circulations, Publics 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Virtual workshop organized by Yale-NUS on unlocking data from historical map collections. About 50 researchers and students attended and asked questions about the work on 'maps as data' strand of Living with Machines.
Year(s) Of Engagement Activity 2020
URL https://historicmapssea.commons.yale-nus.edu.sg/unlocking/
 
Description Using hack days to explore maps at scale blog post 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A blog post introducing 'hack days' as a working practice
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/using-hack-days-to-explore-maps-at-scale/
 
Description Using hack days to explore maps at scale: two examples (blog post) 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A blog post introducing notebooks on working with maps at scale as part of a 'hack day'.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/using-hack-days-to-explore-maps-at-scale-two-examples
 
Description Working with maps at scale using Computer Vision and Jupyter notebooks 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Workshop as part of an event on Digital Humanities at the National Library of Estonia on 'Working with maps at scale using Computer Vision and Jupyter notebooks'
Year(s) Of Engagement Activity 2020
URL https://www.nlib.ee/en/node/8579#nordplus
 
Description Workshop: from docx to tabular data 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact The "Hack and Yack docx2tabs tutorial" workshop showed what a .docx file looks like behind the scenes, and taught how to use Word styles to annotate it in order to feed it into a Jupyter notebook that will transform and output it into tabular data (.csv / spreadsheet).
Year(s) Of Engagement Activity 2021
 
Description information+visualization public talk series: 'Visualising cultural heritage collections: Is the data enough?' lecture 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact In June 2020, Olivia Vane gave a lecture titled 'Visualising cultural heritage collections: Is the data enough?' as part of the information+visualization public talk series organised by Fachhochschule Potsdam - University of Applied Sciences. The lecture was delivered remotely over Zoom because of lockdown restrictions, and the recorded talk and Q&A were afterwards published on YouTube. As of November 2020, the YouTube video had been watched more than 600 times.
Year(s) Of Engagement Activity 2020
URL https://www.youtube.com/watch?v=s5QPlhOJQIM&ab_channel=UCLABPotsdam