Living with Machines

Lead Research Organisation: The Alan Turing Institute
Department Name: Research

Abstract

Living with Machines is both a research project and a bold proposal for a new research paradigm. In this ground-breaking partnership between The Alan Turing Institute, the British Library, and the Universities of Cambridge, East Anglia, Exeter, and London (QMUL), historians, data scientists, geographers, computational linguists, and curators have been brought together to examine the human impact of the industrial revolution.

It is widely recognised that Britain was the birthplace of the world's first industrial revolution, yet there is still much to learn about the human, social, and cultural consequences of this historical moment. Focussing on the long nineteenth century (c.1780-1920), the Living with Machines project aims to harness the combined power of massive digitised archives and computational analytical tools to examine the ways in which technology altered the very fabric of human existence on a hitherto unprecedented scale. The central theme - the mechanisation of work practices - speaks directly to present debates about how society can accommodate the revolutionary consequences of AI and robotics in what has become known as the fourth industrial revolution. To understand the fraught co-existence of human and machine, this project contends that we need research methods that combine technological innovation and human expertise.

The project will utilise the British Library's National Newspaper collection, and event-based records (census, electoral registration, births/marriages/deaths, trade directories) collected by contributing partner Findmypast. By developing intuitive computational interfaces, and adapting collaborative practices developed in the field of software development, we will enable close interaction between computational methods and historical inquiry.

Outreach and Engagement will be central to the project from the outset, and will take two forms: familiar outcomes such as television programmes and regional exhibitions; and working with individuals and communities to create common understandings of their shared histories. Participatory aspects will embody best practices in crowdsourcing and citizen history.

Project benefits:

1. The UK's first large-scale synergy between data science, artificial intelligence research, and the arts and humanities, building capacity and catalysing new research areas.

2. The development of new computational techniques to marshal the UK's rich archival collections (digitised and born-digital), to enable new research questions to be posed of the holdings.

3. Enriched and interlinked data holdings for the British Library, to add additional context and value to content.

4. The development of generalisable tools, code, and infrastructure that can be adapted for and inspire future interdisciplinary research projects.

5. New historical perspectives on the effects of the mechanisation of labour on the lives of ordinary people during the long nineteenth century.

6. The creation of computational models to represent how language and meanings change across time and geography.

7. Research breakthroughs maintaining UK global leadership in Digital Humanities and driving large-scale international partnerships and opportunities.

Planned Impact

Optional.
 
Title British Library Newspapers and Living with Machines 
Description Talk on the British Library's Newspaper collection and the Living with Machines project for Congrès Médias 19 - Numapresse : Presses anciennes et modernes à l'ère du numérique, La BnF, 3 June 2022 
Type Of Art Film/Video/Animation 
Year Produced 2022 
URL https://figshare.com/articles/presentation/British_Library_Newspapers_and_Living_with_Machines/19963...
 
Title Echoing Through Time: New Tunes for Old Words 
Description Leeds-based folk musicians were commissioned to record historical ballads from the British Library's collections that had been researched during the exhibition development process. They recorded the tracks, and subsequently posted them to the British Library's Soundcloud account. 
Type Of Art Composition/Score 
Year Produced 2022 
Impact The musicians are going on to release the recordings on CD, and have performed them live at events for the exhibition. 
URL https://soundcloud.com/the-british-library/sets/echoing-through-time-new-tunes-for-old-words
 
Title Historic machines from 'prams' to 'Parliament': new avenues for collaborative linguistic research 
Description Recording of presentation of long paper, DH Benelux 2022: RE-MIX. Creation and alteration in DH (Hybrid), 1-3 June 2022. Research in computational linguistics has made successful attempts at modelling word meaning at scale, but much remains to be done to put these computational models to the test of historical scholarship. More importantly, a lot of computational research looks at texts in a historical vacuum, 'synchronically', as linguists would say. Living with Machines is an interdisciplinary research project that rethinks the impact of technology on the lives of ordinary people during the Industrial Revolution. During this project, we decided to address a fundamental question: what did people mean by 'machine' and how has this meaning changed over time? This paper outlines how a simple research question like 'what was a machine?' can provide an opportunity to engage the public with our work while also generating data for analysis and new avenues of research in a radically collaborative way. 
Type Of Art Film/Video/Animation 
Year Produced 2022 
URL https://zenodo.org/record/6583744
 
Title Leeds musicians performing historical ballads from the British Library's collections for the Living with Machines exhibition 
Description Leeds-based folk musicians were commissioned to record historical ballads from the British Library's collections that had been researched during the exhibition development process. They also performed the songs at the exhibition opening, and will perform them at future events related to the exhibition. 
Type Of Art Performance (Music, Dance, Drama, etc) 
Year Produced 2022 
Impact The musicians are packaging the recordings as a CD for sale at gigs etc. 
 
Title Living with Machines: human stories from the industrial age 
Description Living with Machines is the first large-scale exhibition developed in partnership between the British Library and Leeds Museums & Galleries. The exhibition is inspired by the Living with Machines research project. The free exhibition revisits the history of the industrial revolution in Britain through the lens of Leeds and the surrounding regions. It unearths forgotten stories revealing how rapid changes in technology in the nineteenth century changed life and work forever. Contemporary responses, offering reflections on the parallels between mechanisation in the 19th century and advances in AI and digital technology are woven throughout the display. The accompanying events programme includes loom weaving, crafts workshops, a Wiki edit-a-thon, and a special AI series as part of Leeds Digital Festival. The exhibition includes innovative digital interactives built with data crowdsourced through the project. Exhibition captions were written to be accessible to those without any knowledge of the topic, and to a reading age of approximately 10 years old. 
Type Of Art Artistic/Creative Exhibition 
Year Produced 2022 
Impact Exhibition research led to the recording of 19th century ballads by contemporary musicians. The events programme has drawn in a range of audiences, from families with very young children to workers in the tech industry. By the end of September, attendance figures were: For events: 1416 family programme visitors of all ages, and 141 adult event attendees. For the exhibition: 18,258 
URL https://museumsandgalleries.leeds.gov.uk/events/leeds-city-museum/living-with-machines-human-stories...
 
Title Newspaper Infographic Exhibition, British Library 
Description LwM took responsibility for one of the panels in the British Library's (forthcoming) exhibition of nineteenth-century newspaper infographics. In collaboration with the Library's Lead Curator of News, Luke McKernan, and with Yann Ryan, Daniel Wilson (History, text) and Mariona Coll Ardanuy (Computational Linguistics, code) conceived an experimental panel to showcase our research using sentiment analysis on historical newspapers. The panel was made by infographic designer Ciaran Hughes, using datasets provided by the project which focused on emotional responses to industrialisation as seen in newspaper headlines. The exhibition will involve six such panels and will use modern infographic presentational techniques on historical data to tell arresting new stories about nineteenth-century Britain. The exhibition will open on the Lower Ground Floor of the BL in Spring 2021 and will hopefully be seen by large numbers of people and be reported in the press itself. 
Type Of Art Artistic/Creative Exhibition 
Year Produced 2020 
Impact We hope the exhibition will be a corrective to the badly researched uses of historical newspapers to make methodologically unsound claims about the past, and instead showcase a more credible way to apply data science to historical materials, while simultaneously grabbing attention and showcasing the work of the project. 
 
Description The scope and ambition of this project can be summed up in terms of its two foundational objectives. Living with Machines (LwM) sought to understand what computationally-driven historical research is now possible in light of more than two decades of investment in the digitisation of our national cultural assets. And it sought to realise that research potential through a radical experiment in collaboration, bringing together an extremely large and diverse team of researchers and professionals. On both counts, LwM was a huge success.

Firstly, LwM has amassed a vital evidence-base for helping us to understand the affordances of digitised holdings, as well as the barriers to research, within the current cultural heritage data landscape in the UK. The project has provided models for working within that landscape and made recommendations for changes to policy in our book Collaborative Historical Research in the Age of Big Data: Lessons from an interdisciplinary project (Cambridge University Press, 2023) (CHR). LwM has also created a whole host of new assets which help to make digitised collections research-ready. These assets comprise new datasets (including new digitised content, derived data from existing collections, and databases) and code (including Jupyter notebooks, code accompanying publications, pipelines, and documented software). At the applied level, these combined assets have unlocked a huge number of research opportunities and new historical insights. At the more general level, they represent potential building blocks towards a modularised research infrastructure. In this space we have also seen a number of stand-out successes, including the development of 'MapReader', a computer vision pipeline that has been harnessed successfully not only for the distant viewing and searching of map sheets at scale, but also for the analysis of biological image data, demonstrating how the humanities can develop technical products that have an impact in the sciences. More importantly, we have begun to develop frameworks for building sustainable communities around these assets through training and workshops, which will influence and be extended by one of LwM's follow-on projects.

Secondly, as an experiment in radical interdisciplinary research, the project has also delivered beyond its expectations. This is not to say that everything ran without friction, but rather that we learned huge amounts about one another's fields, about different ways of working and how to work together, and we took time regularly to reflect and decide how to continue in light of what we had learned. As a result we were able to develop our practice and research iteratively, in ways which played to new realities and new strengths. For many members of the team, this experience has been truly transformative: all members will be taking these new skills and experiences forward with them, and we were also able to make concrete recommendations back to our communities in light of this process through our CHR book, our 10-part docu-series, talks, and blog posts.

As a result of the progress made in these two areas, we are now at the point where we can begin to write new histories of the impact of mechanisation on the lives of ordinary people in the long nineteenth century. We are in the process of publishing these through a series of articles and a multi-authored book (under contract), Living with Machines: Computational Histories of the Age of Industry.
Exploitation Route We have created new tools and datasets for a number of different communities which we believe have the power to drive forward research by leaps, rather than increments. Crucially we have sought to ensure that this progress can be carried out beyond our team by making all our code openly available, and by publishing data for reproducibility. However, where many endeavours in this space fall short is with the attitude 'if you build it, they will come'. New methods, tools and datasets are of no use if nobody knows about them. In our final phase of the project we have been focusing on ensuring the legacy of our collaboration by developing user communities around our most important tools and methods through blog posts, workshops, in-person and published tutorials. The specific work of building communities around these assets will be driven forward by the spin-off project 'Building sustainable communities around datasets and software', which began shortly after the end of Living with Machines (LwM) and is led by Pieter Francois (Turing/Oxford), with several members of LwM as Co-Is (Ahnert, Beavan, Nanni, Hobson, and McDonough). The proposal for this project pinpointed the problems of project-based funding for digital research, which often equates to poor return on investment due to the lack of infrastructure to support outputs and their uptake beyond a project's end date in terms of hosting, maintenance, or human expertise. The team's proposal extended the blueprint of community development already tested on LwM, proposing that if components were made more generalisable through good packaging and documentation, and if communities of users and maintainers were actively built around them, the UK could create the basic components of a modular research infrastructure. 
This project joins the suite of spin-off projects co-developed by team members, including Machines Reading Maps (McDonough), The Congruence Engine (Wilson), Impresso 2 (Beelen), and The Collective Wisdom project (Ridge). Finally, we sought to seed further innovative engagement with our project's assets by funding six 'digital residencies'. These residencies were small fellowships or project awards designed to enable work around one of our datasets or tools. These include data visualisations, a visual poem, a performance piece, a newspaper data processing pipeline, and an online book of tutorials about how to work with newspaper data. The fruits of their labour are reported on our project blog (https://livingwithmachines.ac.uk/latest/) and in reports deposited on the British Library's Research Repository (https://bl.iro.bl.uk/).
Sectors Creative Economy

Digital/Communication/Information Technologies (including Software)

Culture

Heritage

Museums and Collections

URL https://livingwithmachines.ac.uk/
 
Description Thanks to the leadership provided by the British Library (BL), the public engagement work on the project has been first class. The project exhibition was attended by over 42,000 people, and the crowdsourcing work put us in contact with over 5,500 volunteers. As our overview of stakeholder feedback shows, the project has had an impact both on the future of humanities work at the Turing, and on our partners in the cultural heritage sector. The BL reports benefits from their collaboration with Leeds City Museum on the exhibition, from the programme of crowdsourcing, through the enrichment of the existing Digital Scholarship Training Programme, as well as in the development of spinoff projects (such as the development of Flyswot). More broadly, the experience of LwM will shape the BL's future AI strategy in important ways. In addition we have been excited to see how effectively we have been able to stimulate the use of our data and code assets through the mechanism of the Digital Residencies. These have not only delivered innovative outcomes, but extended the community gathering around our work into areas such as art and performance. We have also been able to reach a much larger number of people through training and workshops due to the increased effort in that area compared to our plans at the outset. We are very pleased that our research has also reached the public through the development of our work on OS maps into a story run by The Economist in April 2023. During the months following the official end of the project, we are also releasing episodes from our documentary series via The Alan Turing Institute's YouTube channel (https://www.youtube.com/playlist?list=PLuD_SqLtxSdWMYcu5YQDGqP9AGejg_cBb) which are designed to make the different research outcomes from our project accessible to the general public.
First Year Of Impact 2023
Sector Creative Economy,Digital/Communication/Information Technologies (including Software),Education,Leisure Activities, including Sports, Recreation and Tourism,Culture, Heritage, Museums and Collections
Impact Types Cultural

 
Description Advisory Board member (and author of white paper) 'iDAH Research Software Engineering (RSE) Steering Group Working Paper'
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
 
Description Andre Piza participated in European Commission's "Study on Opportunities and challenges of Artificial Intelligence Technologies for the Cultural and Creative Sectors"
Geographic Reach Europe 
Policy Influence Type Contribution to a national consultation/review
 
Description British Library Research Report 2018-19
Geographic Reach National 
Policy Influence Type Citation in other policy documents
Impact The British Library Research Report features the Living with Machines project accounting for its impact on the British Library's activities. According to the report, "The project has already helped the Library explore the potential and challenges of data science methods, including copyright, the use of cloud-based services at scale, and meshing digitisation and analytical timeframes." *(1) and it is "advancing our [the BL's] capability to undertake computational analysis using very large and heterogeneous digitised sources, and our understanding of types of infrastructure that will enable us to deploy more data-driven research in the future."*(2) *(1) Mia Ridge, British Library's Digital Curator for Western Heritage Collections (and Co-I on Living with Machines) *(2) Maja Maricevic, British Library's Head of Higher Education and Science (and Co-I on Living with Machines)
URL https://www.bl.uk/news/2020/november/publication-of-2018-19-research-report
 
Description Guest lecture and assigned reading: Crowdsourcing at the British Library for UCL's MSc in Data Science for Cultural Heritage
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
 
Description Guest lecture: Europeana masterclass for Open Digital Cultural Heritage
Geographic Reach Europe 
Policy Influence Type Influenced training of practitioners or researchers
 
Description Guest lecture: INOS project, overview of citizen science and crowdsourcing
Geographic Reach Europe 
Policy Influence Type Influenced training of practitioners or researchers
URL https://inos-project.eu/2021/07/28/workshop-report-citizen-science-why-get-involved/
 
Description Hands-on workshop 'Planning Crowdsourcing Projects in Cultural Heritage' for Europeana Research and the Europeana Research Community
Geographic Reach Multiple continents/international 
Policy Influence Type Influenced training of practitioners or researchers
Impact Participants reported improved abilities to undertake crowdsourcing projects. As the workshop was held a few weeks ago, evidence is still being gathered.
 
Description Invited lecture, Crowdsourcing at the British Library
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
 
Description Participation as a case study in AHRC's Technician Commitment Action Plan
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
Impact RTP case studies will be used for internal and external purposes. In the first instance, they will be shared with colleagues across the organisation to increase everyone's understanding of the term Research Technical Professional (technician) within the context of the Arts and Humanities. This will enable AHRC colleagues to confidently identify members of the RTP community working within their respective schemes and keep members of this community informed of how we are championing the Technician Commitment. Case Studies will help to ensure that decisions made at a strategic and governance level are well informed by the varied experiences of RTPs. In the future we will be keen to publish case studies on external platforms such as our website and communication channels, for example the AHRC newsletter and blog. Most importantly, Case Studies will inform research, dialogue and future activity with regards to AHRC's Technician Commitment Action Plan.
 
Description Project output used in collaborative workshop with Estonian museums
Geographic Reach Europe 
Policy Influence Type Influenced training of practitioners or researchers
Impact Attendees developed skills at the workshop, enhanced by their access to our Open Access Handbook.
URL https://esm.ee/for-visitors/news/the-war-museum-helps-estonian-museums-to-put-crowdsourcing-into-use
 
Description Ruth Ahnert fed into the Forecasting Forum on the Future of Research, at thinktank Demos, December 5th 2019, London.
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
URL https://demos.co.uk/wp-content/uploads/2019/10/Jisc-OCT-2019-2.pdf
 
Description Special lecture 'Things to Know When Planning Crowdsourcing Projects in Cultural Heritage' for Europeana
Geographic Reach Multiple continents/international 
Policy Influence Type Influenced training of practitioners or researchers
 
Description Task & Finish Group for JISC
Geographic Reach National 
Policy Influence Type Contribution to new or improved professional practice
URL https://digitisation.jiscinvolve.org/wp/2023/02/03/is-ai-for-me
 
Description Congruence Engine -- Towards a National Collection Discovery Project -- Secondment for Daniel Wilson to Science Museum Group
Amount £3,000,000 (GBP)
Organisation Arts & Humanities Research Council (AHRC) 
Sector Public
Country United Kingdom
Start 11/2021 
End 07/2024
 
Description From crowdsourcing to digitally-enabled participation: the state of the art in collaboration, access, and inclusion for cultural heritage institutions
Amount £64,801 (GBP)
Funding ID AH/T013052/1 
Organisation Arts & Humanities Research Council (AHRC) 
Sector Public
Country United Kingdom
Start 02/2020 
End 12/2021
 
Description Machines Reading Maps: Finding and Understanding Text on Maps
Amount £199,529 (GBP)
Funding ID AH/V009400/1 
Organisation Arts & Humanities Research Council (AHRC) 
Sector Public
Country United Kingdom
Start 02/2021 
End 10/2022
 
Title A Toponym Resolution Pipeline for Digitised Historical Newspapers 
Description T-Res is an end-to-end pipeline for toponym detection, linking, and resolution on digitised historical newspapers. Given an input text, T-Res identifies the places that are mentioned in it, links them to their corresponding Wikidata IDs, and provides their geographic coordinates. T-Res has been developed to help researchers explore large collections of digitised historical newspapers, and has been designed to tackle common problems often found when dealing with this type of data. 
Type Of Material Improvements to research infrastructure 
Year Produced 2023 
Provided To Others? Yes  
Impact The tool has already been used for enriching other datasets in the project, and is currently being used in different research experiments for finding places in historical newspapers. 
URL https://github.com/Living-with-machines/T-Res
 
Title Beavan, D., Jackson, M. Plain text and metadata extraction tool 
Description Tool for parallel processing of XML in METS/ALTO format for extraction of plain text and metadata fields, available in XSLT and Python versions. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact This data wrangling tool facilitated downstream analysis of historical newspapers focussing on toponym resolution and OCR quality. It forms an essential part of the preprocessing pipeline that will be applied to new datasets whose acquisition is in progress. 
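
The extraction step described above can be illustrated with a minimal Python sketch. This is not the project's tool: the sample ALTO page, the function names `local_name` and `alto_to_text`, and the namespace-stripping approach are assumptions made for illustration, and real METS/ALTO files carry far more structure and metadata than shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified ALTO page used only for illustration.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace>
    <TextBlock>
      <TextLine>
        <String CONTENT="Living"/><SP/><String CONTENT="with"/>
      </TextLine>
      <TextLine>
        <String CONTENT="Machines"/>
      </TextLine>
    </TextBlock>
  </PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Strip the XML namespace so the same code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_to_text(xml_string):
    """Return plain text with one output line per ALTO TextLine element."""
    root = ET.fromstring(xml_string)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        # Each word is stored in the CONTENT attribute of a String element.
        words = [
            child.attrib.get("CONTENT", "")
            for child in element
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(words))
    return "\n".join(lines)


print(alto_to_text(SAMPLE_ALTO))
```

Because each page is processed independently, a function of this shape could be mapped over many files in parallel, which is the general pattern the description refers to.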
 
Title Beelen, K., Lexicon Expansion Interface 
Description Notebook for exploring word2vec models in order to build a lexicon that can trace certain topics in a collection. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact The Lexicon Expansion Interface allows users to navigate a vector space and expand a list of seed words into a Lexicon. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/lexicon-expansion/language-l...
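
The idea of expanding seed words through a vector space can be sketched as follows. This is an illustrative sketch, not the notebook's code: the toy vectors, the function names `cosine` and `expand_lexicon`, and the `top_n` and `min_similarity` parameters are all assumptions, standing in for a trained word2vec model and its nearest-neighbour query.

```python
import math

# Toy 3-dimensional vectors standing in for a trained word2vec model.
# The words and values are invented for illustration only.
TOY_VECTORS = {
    "machine": [0.9, 0.1, 0.0],
    "engine": [0.8, 0.2, 0.1],
    "loom": [0.7, 0.3, 0.0],
    "harvest": [0.0, 0.9, 0.4],
    "sermon": [0.1, 0.0, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_lexicon(seeds, vectors, top_n=2, min_similarity=0.8):
    """Add each seed's nearest neighbours that pass a similarity threshold."""
    lexicon = set(seeds)
    for seed in seeds:
        neighbours = sorted(
            (
                (cosine(vectors[seed], vector), word)
                for word, vector in vectors.items()
                if word != seed
            ),
            reverse=True,
        )
        for similarity, word in neighbours[:top_n]:
            if similarity >= min_similarity:
                lexicon.add(word)
    return sorted(lexicon)


print(expand_lexicon(["machine"], TOY_VECTORS))
```

In an interactive setting, a researcher would typically review each proposed neighbour before accepting it into the lexicon; the threshold here is only a stand-in for that human judgement.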
 
Title Beelen, K., Lexicon Generator, a tool for generating contrastive lexicons using newspaper data 
Description Notebook for building a lexicon by contrasting two corpora using the Fightin' Words algorithm created by Monroe et al, 2008. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact This notebook is an implementation of the Monroe et al algorithm "Fightin' Words". It is a feature extraction algorithm that computes which words are most significantly associated with a specific subcorpus. This notebook helps us to "profile" certain types of language (e.g. contrast conservative to liberal newspapers) 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language-lab-mro/lexi...
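
A minimal sketch of the contrastive scoring idea, assuming the log-odds-ratio formulation with a Dirichlet prior from Monroe et al. (2008). It is not the notebook's code: the function name `fightin_words`, the toy counts, and the uniform prior `alpha=0.5` are assumptions, and the uniform prior is a simplification of the informative prior the method can also use.

```python
import math


def fightin_words(counts_a, counts_b, alpha=0.5):
    """Return z-scores of log-odds ratios; positive means typical of corpus A.

    counts_a and counts_b map words to raw frequencies in each corpus.
    alpha is a uniform Dirichlet prior (an assumption for this sketch).
    """
    vocab = sorted(set(counts_a) | set(counts_b))
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    alpha_0 = alpha * len(vocab)  # total prior mass over the vocabulary
    z_scores = {}
    for word in vocab:
        y_a = counts_a.get(word, 0)
        y_b = counts_b.get(word, 0)
        # Smoothed log-odds of the word within each corpus.
        log_odds_a = math.log((y_a + alpha) / (n_a + alpha_0 - y_a - alpha))
        log_odds_b = math.log((y_b + alpha) / (n_b + alpha_0 - y_b - alpha))
        # Approximate variance of the difference in log-odds.
        variance = 1.0 / (y_a + alpha) + 1.0 / (y_b + alpha)
        z_scores[word] = (log_odds_a - log_odds_b) / math.sqrt(variance)
    return z_scores


# Invented counts: "steam" favours corpus A, "vote" favours corpus B.
corpus_a = {"steam": 10, "the": 50, "vote": 1}
corpus_b = {"steam": 1, "the": 50, "vote": 10}
for word, score in sorted(fightin_words(corpus_a, corpus_b).items()):
    print(f"{word}: {score:+.2f}")
```

Dividing by the estimated standard deviation is what keeps very rare words from dominating the ranking, which is the main reason to prefer this score over a raw frequency ratio.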
 
Title Beelen, K., Newspaper metadata database and search interface: scripts to build an ElasticSearch index and explore the data using Kibana 
Description Scripts to build an ElasticSearch index and explore the data using Kibana 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Newspaper metadata database and search interface. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/sources-lab-mro/elast...
 
Title Beelen, K., Pipeline for processing the Newspaper Press Directories 
Description The series of notebooks includes a pipeline for processing the OCR (derived from the scans of Mitchell's Press Directories). The stages include: annotation, preprocessing, automatic tagging and database ingest. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact This tool will be crucial for parsing and enriching implicitly structured data (such as the press directories, but also other historical sources). 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/sources-lab-mro/ndp_p...
 
Title Code for Targeted Sense Disambiguation 
Description Code for Targeted Sense Disambiguation and for reproducing the results published in the paper at http://dx.doi.org/10.18653/v1/2021.findings-acl.243 
Type Of Material Improvements to research infrastructure 
Year Produced 2021 
Provided To Others? Yes  
Impact Reproducing results of the paper. Tools for historical sense disambiguation. 
URL https://github.com/Living-with-machines/TargetedSenseDisambiguation
 
Title Coll Ardanuy, M., Hosseini, K., van Strien, D., McDonough, K., Wilson, D., Krause, A., underlying code for the paper 'Resolving Places, Past and Present: Toponym Resolution in Historical British Newspapers Using Multiple Resources' 
Description Underlying code for the paper 'Resolving Places, Past and Present: Toponym Resolution in Historical British Newspapers Using Multiple Resources'. Resolving Places is one of the first outputs of Living with Machines, a collaborative digital history project at The Alan Turing Institute and the British Library. This research is part of our work to build a nineteenth-century gazetteer that combines place names derived from historical sources (GB1900) with online resources (Wikipedia and Geonames). GB1900 is the result of a crowdsourced project that transcribed all text labels on the 2nd edition 6-inch to 1 mile Ordnance Survey maps of Great Britain (ca. 1900) held by the National Library of Scotland (NLS Maps online). The Living with Machines gazetteer follows best practices in combining multiple existing resources, and is novel in accounting for places that have different scales (e.g. streets, buildings, cities, counties). In the future, we will be adding records and enriching current records with information from OS map 1st edition map label data and other sources. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact This work was presented at a workshop on 27-28 November. Several attendees at the workshop showed interest in using the gazetteer produced through this code. Subsequent completed work and work in progress uses it, within and outside our project. 
URL https://github.com/alan-turing-institute/lwm_GIR19_resolving_places/
 
Title Coll-Ardanuy, M., Code that builds a gazetteer from scratch 
Description Code and method to generate a gazetteer from Wikipedia and enriched with Geonames data. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Part of larger workflow to create a geographical knowledge base that combines different 19thC knowledge sources together. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language-lab-mro/gaze...
 
Title Coll-Ardanuy, M., Hosseini, K., Nanni, F., Toponym Matching 
Description This work looks for potential locations for each toponym identified in text. It addresses the high degree of variation in toponyms (due to regional spelling differences, transliteration strategies, and cross-language and diachronic variation) as well as variation due to OCR errors. 
Type Of Material Improvements to research infrastructure 
Year Produced 2021 
Provided To Others? Yes  
Impact We have built a flexible deep learning framework for candidate selection through toponym matching, using various state-of-the-art neural network architectures (DeezyMatch). The paper that accompanies this repository assesses the performance of DeezyMatch in different experimental settings. The DeezyMatch repository has had a notable impact; this accompanying repository is used for reference. 
URL https://github.com/Living-with-machines/LwM_SIGSPATIAL2020_ToponymMatching
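
To make the candidate selection task concrete, here is a simple string-similarity baseline using Python's standard library. It is explicitly not DeezyMatch and does not use its API: the gazetteer entries, the function name `rank_candidates`, and the use of `difflib.SequenceMatcher` are assumptions chosen only to show what ranking gazetteer candidates for an OCR-damaged toponym looks like.

```python
from difflib import SequenceMatcher

# Invented mini-gazetteer for illustration only.
GAZETTEER = ["Manchester", "Winchester", "Colchester", "Leeds"]


def rank_candidates(query, gazetteer, top_n=3):
    """Rank gazetteer names by character-level similarity to the query."""
    scored = [
        (SequenceMatcher(None, query.lower(), name.lower()).ratio(), name)
        for name in gazetteer
    ]
    # Highest similarity first; ties keep gazetteer order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# "Manchefter" imitates a common OCR confusion of the long s with f.
for score, name in rank_candidates("Manchefter", GAZETTEER):
    print(f"{name}: {score:.2f}")
```

A fixed similarity measure like this cannot learn which character substitutions are typical of a given source, which is the gap that a trained neural matcher is meant to close.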
 
Title DeezyMatch Tutorials 
Description The "DeezyMatch_tutorials" Github repository is a collection of tutorials for DeezyMatch (a free, open-source software library written in Python for fuzzy string matching and candidate ranking, developed within the Living with Machines project). In this repository, we collect some tutorials for DeezyMatch, and provide new code for a tutorial offered at the Digital Humanities 2022 conference. 
Type Of Material Improvements to research infrastructure 
Year Produced 2022 
Provided To Others? Yes  
Impact This repository collects a series of tutorials for DeezyMatch, to help other researchers use the tool. The main DeezyMatch repository has 105 stars and 29 forks (as of June 2023). 
URL https://github.com/Living-with-machines/DeezyMatch_tutorials
 
Title Hobson, T., Tolfo, G. Methodological paper on Living with Machines' metamodel 
Description Data modelling methodology developed to underpin data infrastructure with the aim of promoting interoperability of tools and systems and accessibility of data and derived artefacts within the project and externally. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact The common data model developed by this method has been used in the design of relational database schemas and other research infrastructure to support interoperability across different source data types and varied research activities. 
URL https://www.overleaf.com/read/qjqqfdrqxkpr
 
Title Hosseini, K. and Vane, O. PressPicker code 
Description The PressPicker tool can be used to filter and visualise British Library holdings of undigitised newspapers as a function of time. It is also an interactive tool to pick newspaper titles (e.g. for digitisation). It consists of two Python Jupyter notebooks and a custom JavaScript interactive visualisation. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Successfully made two selections of newspaper titles for digitising within Living with Machines. 
 
Title Hosseini, K., Beelen, K., basic lexicon expansion algorithms using word embeddings 
Description In this notebook, we use the trained word embeddings (using word2vec or fasttext models) to explore the semantic space of our book and sample newspaper datasets. Several basic methods are implemented, e.g. explore the neighbouring words given a seed word (e.g., what are the most similar words to "machine" given our corpus?); visualisation of word vectors using t-SNE. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact This work is in progress. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/blob/master/language-lab-mro/lexi...
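The neighbour exploration described in this record (finding the words most similar to a seed such as "machine") comes down to ranking vectors by cosine similarity. The sketch below shows that idea in plain Python with made-up three-dimensional vectors; it is not the notebook's code, and `TOY_VECTORS`, `cosine_similarity` and `most_similar` are hypothetical names. Trained word2vec or fastText models would supply vectors with hundreds of dimensions.

```python
import math

# Made-up vectors for illustration only; real embeddings are learned from a corpus.
TOY_VECTORS = {
    "machine": [0.9, 0.1, 0.0],
    "engine": [0.8, 0.2, 0.1],
    "loom": [0.7, 0.3, 0.0],
    "garden": [0.0, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(seed, vectors, topn=3):
    """Return the topn words closest to the seed word, excluding the seed itself."""
    seed_vec = vectors[seed]
    scored = [
        (word, cosine_similarity(seed_vec, vec))
        for word, vec in vectors.items()
        if word != seed
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


neighbours = most_similar("machine", TOY_VECTORS, topn=2)
print(neighbours)
```

Lexicon expansion then amounts to adding the top-ranked neighbours to the seed list and, optionally, repeating the lookup for each new word.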
 
Title Hosseini, K., Nanni, F., Coll-Ardanuy, M., DeezyMatch: A Flexible Deep Neural Network Approach to Fuzzy String Matching 
Description A free, open-source software library written in Python for fuzzy string matching and candidate ranking. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact String matching is an integral component of many natural language processing (NLP) pipelines. DeezyMatch, a new deep learning approach to fuzzy string matching and candidate ranking, is a free, open-source community software that strives to address advanced string matching and candidate ranking challenges in a more comprehensive and integrated manner than existing tools. DeezyMatch is written in the Python programming language. Thanks to its easy-to-use interfaces, DeezyMatch can be seamlessly integrated into existing entity linking systems. This allows DeezyMatch to be adopted outside the NLP community, especially in Digital Humanities, where it could play a major role in addressing known issues concerning the adoption of entity linking systems due to the non-standard nature of the datasets typically used in this field. DeezyMatch has been the topic of a tutorial and round table (at the LinkedPasts conference 2020) and of an interactive workshop (at the Alan Turing Institute Digital Humanities and Research Software Engineering Summer School, 2021). The GitHub repository has 64 stars and 26 forks. 
URL https://github.com/Living-with-machines/DeezyMatch
 
Title Hosseini, K., exploratory data analysis of GB1900 dataset 
Description A set of Jupyter-notebooks for visualisation and statistical analysis of GB1900 dataset. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These Jupyter-notebooks were developed to explore the GB1900 dataset, including visualisation of various entities (e.g., railway) on a map. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/space-time-mro/gb1900...
 
Title Hosseini, K., exploratory data analysis of newspaper/book databases 
Description A set of Jupyter-notebooks to perform exploratory data analysis on newspaper and book databases. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These notebooks were developed as teaching/research tools to: 1) show how to access a remote Postgres DB, query it, and plot the results; 2) perform exploratory data analysis (e.g., visualisation and simple statistical analysis) on the data. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/relational_database_e...
 
Title Hosseini, K., from raw data to language-models/word-embeddings 
Description These notebooks combined form a pipeline in which raw book/newspaper textual data can be accessed, preprocessed and then used to generate word embeddings and language models. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These notebooks (and their Python-script version) have been extensively used to generate word2vec, fasttext, Flair and BERT language models. These models are being used in several NLP-related projects. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language_models/noteb...
 
Title Hosseini, K., intrinsic evaluation of word embeddings / language models 
Description The performance of any trained machine learning model needs to be evaluated (intrinsically or extrinsically) before being used. Here, we collected several datasets and developed code to evaluate trained word embeddings and language models. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Evaluation of all word-embeddings/language models being used in the project. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/blob/master/language_models/noteb...
 
Title Hosseini, K., parallel processing of book (and newspaper) dataset using MPI (Message Passing Interface) 
Description As we are dealing with large textual datasets (e.g., our book dataset contains 4.5B words), we started to experiment with different distributed and parallel algorithms to preprocess data and to train machine learning models. Here, we used MPI (Message Passing Interface) through Python. This code distributes the job among the requested number of CPUs (workers), which can be on different nodes in a supercomputer (i.e. not limited to shared-memory machines); therefore, it significantly reduces the wall time. This code was tested on Urika. Unfortunately, Urika is no longer available, and we now exclusively use Azure virtual machines (VMs). These VMs are shared-memory, so we switched to simpler parallel-processing algorithms. However, the MPI algorithm and tools developed here should be usable later when we have access to even larger datasets. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Preprocess and extract information (e.g., part-of-speech tagging) from large textual datasets. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language_models/mpi_v...
 
Title Hosseini, K., record linkage using various multi-class classifiers and manual annotations 
Description Record linkage across two noisy datasets (for example, historical texts) is a non-trivial task. In this tool, we experimented with different multi-class classifiers, e.g. decision tree and multilayer perceptron architectures. We also assessed the impact of features (e.g., title, date and place of publication) on the statistical performance of these models. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Creating a list of linked entities between NPD (newspaper press directory) and British Library titles list. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/sources-lab-mro/linki...
 
Title Hosseini, K., upload images to Zooniverse 
Description ~10,000 images from the digitised newspaper articles were selected and uploaded to Zooniverse for annotation. Defoe, a Spark-based toolbox for analysing digital historical textual data, was used to select the images. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact The human/expert annotation is one of the main ingredients in training and evaluating supervised machine learning methods. The results of this experiment can be used in various tasks, e.g., sentence/document classification. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/communities-mro/zooni...
 
Title Living with Machines GitHub Stats report 
Description This repository automatically updates GitHub statistics data for the Living with Machines GitHub Organization and generates a report based on this data. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact 38 unique repository views 
URL https://github.com/Living-with-machines/github_stats_report
 
Title Materials for the Text to Tech workshop at the Digital Humanities Oxford Summer School 
Description The dhoxss-text2tech GitHub repository contains the materials for the Text to Tech workshop at the Digital Humanities Oxford Summer School (used in the 2022 and 2023 editions). This hands-on workshop offers an introduction to programming in Python and to natural language processing, from processing texts to extracting meaning from them, as well as the basics of automated semantic analysis with machine learning. The materials are publicly available, and consist of a series of Jupyter notebooks, each covering a different topic. 
Type Of Material Improvements to research infrastructure 
Year Produced 2022 
Provided To Others? Yes  
Impact These materials were used in the 2022 and 2023 editions of the Text to Tech strand of the Digital Humanities Oxford Summer School. In the 2022 edition, 35 students attended the strand. As part of the Summer School feedback survey, one of the participants said that "the notebooks were excellent and I know they will be a resource that I and the other students will keep going back to". 
URL https://github.com/Living-with-machines/dhoxss-text2tech
 
Title Neural Language Models for Historical Research 
Description We have pre-trained four types of neural language models on a large historical dataset of English-language books published between 1760 and 1900, comprising ~5.1 billion tokens. The language model architectures include word type embeddings (word2vec and fastText) and contextualized models (BERT and Flair). For each architecture, we trained a model instance using the whole dataset. Additionally, we trained separate instances on text published before 1850 for the type embeddings (i.e., word2vec and fastText), and four instances considering different time slices for BERT. 
Type Of Material Improvements to research infrastructure 
Year Produced 2021 
Provided To Others? Yes  
Impact The repository has had several forks and the language models are already being used by several researchers external to the project. 
URL https://github.com/Living-with-machines/histLM
 
Title Repository for code underlying the paper 'Living Machines: A Study of Atypical Animacy' (COLING2020) 
Description This repository provides underlying code and materials for the paper 'Living Machines: A Study of Atypical Animacy' (COLING2020). 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact This is the code accompanying the paper "Living Machines: A study of atypical animacy" (2020). This paper has already been cited three times in external publications, and the GitHub repository has four external stargazers and one fork. The code in this paper has been used and adapted in a forthcoming publication from the Living with Machines project. 
URL https://github.com/Living-with-machines/AtypicalAnimacy/
 
Title Station to Station: Linking and Enriching Historical British Railway Data 
Description This repository provides underlying code and materials for the paper 'Station to Station: Linking and Enriching Historical British Railway Data', accepted at the Computational Humanities Research conference (2021). It contains the steps to reproduce the experiments reported in the paper and to generate a structured version of Michael Quick's book "Railway Passenger Stations in Great Britain: a Chronology". 
Type Of Material Improvements to research infrastructure 
Year Produced 2021 
Provided To Others? Yes  
Impact This repository contains the code used to generate StopsGB (Structured Timeline of Passenger Stations in Great Britain, https://doi.org/10.23636/wvva-3d67). This dataset is currently being used in other projects within Living with Machines, and we believe it will be of widespread interest across the historical, digital library and semantic web communities, and that it will be a key resource for ongoing research into the impact of the railway in Great Britain. The code used to generate a gazetteer is already being used in the Machine Reading Maps project. 
URL https://github.com/Living-with-machines/station-to-station
 
Title Vane, O. OS maps metadata visualisation code 
Description Custom visualisation of digitised 19th Century Ordnance Survey maps (from National Library of Scotland) to investigate patterns of map revision through time. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Used tool to create supporting material for BL map digitisation proposal and to help identify suitable locations for historical case studies (factors include OS map coverage). 
 
Title Vane, O., Code for filtering Kings Topographical map collection metadata 
Description Python Jupyter notebook for filtering British Library KTop metadata by geography and time. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Identifying relevant digitised material for Living with Machines research. 
 
Title Vane, O., Code underlying a blogpost about how to put a D3 JavaScript visualisation in a Python Jupyter notebook. 
Description Jupyter notebook demonstrating how to use JavaScript and the D3 visualisation library in a Python Jupyter notebook. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact Email from a blog reader describing it as very helpful. 
URL https://github.com/alan-turing-institute/D3_JS_viz_in_a_Python_Jupyter_notebook
 
Title Vane, O., Strabo output visualisation code 
Description Visualising the output of 'Strabo' tool (software tool to auto-transcribe text in historical maps by researchers at the University of Southern California Spatial Informatics Laboratory). 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Non-statistical evaluation of the Strabo tool's success with our map data. 
 
Title Wiki2Gaz: A series of scripts to create a gazetteer from Wikipedia and Wikidata 
Description This repository contains a series of scripts to create Wiki-based resources that can be used for different geographic entity linking tasks. 
Type Of Material Improvements to research infrastructure 
Year Produced 2023 
Provided To Others? Yes  
Impact The resources generated using these scripts are used by the T-Res tool, and have also been used for enriching other datasets in the Living with Machines project. 
URL https://github.com/Living-with-machines/wiki2gaz
 
Title Working with maps at scale using Computer Vision and Jupyter notebooks (Notebook/code) 
Description Notebook showing how to use computer vision/Jupyter Notebooks to support working with image collections at scale. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact Materials used at a workshop with ~30 attendees. 
URL https://github.com/Living-with-machines/maps-at-scale-using-computer-vision-and-jupyter-notebooks
 
Title gh_orgstats 
Description gh_orgstats is intended to provide some easy ways of getting stats for a GitHub org. gh_orgstats does this by wrapping some functions around PyGithub. This code is mainly intended to help generate reports as part of a GitHub actions pipeline to update GitHub usage stats for a funder. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact 56 unique GitHub Clones of the repository hosting the code 
URL https://github.com/Living-with-machines/gh_orgstats
 
Title van Strien, D., Beelen, K., Coll Ardanuy, M., Hosseini, K., McGillivray, B., Colavizza, G., underlying code for the paper 'Assessing the Impact of OCR Quality on Downstream NLP Tasks' 
Description These notebooks contain the underlying code for the paper 'Assessing the Impact of OCR Quality on Downstream NLP Tasks'. The code runs experiments reported in the paper and generates the figures used in the paper. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact This code helps the project better understand issues relating to OCR technology and will inform research methods for our projects and other projects working with text produced through OCR. 
URL https://github.com/alan-turing-institute/lwm_ARTIDIGH_2020_OCR_impact_downstream_NLP_tasks
 
Title van Strien, D., Beelen, K., McDonough, K. 4 Jupyter notebooks on basic computer vision methods for historic OS maps 
Description These notebooks provide an explanation on using computer vision methods with historic maps. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These notebooks have been used in two workshops with >40 participants. They will be developed further into a series of tutorials. 
 
Title van Strien, D., Beelen, K., McDonough, K. 5 Jupyter notebooks on using Deep-learning methods for computer vision on historic OS maps 
Description Additional notebooks on using computer vision methods with historic digitised map collections. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These notebooks have been used as teaching materials in two workshops and will be developed further into publicly available tutorials. 
 
Title van Strien, D., Prototype Maps annotation pipeline 
Description A prototype method for collecting annotations from researchers, running classification and analysing historic maps at scale. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? No  
Impact These methods have been used as an initial prototype which is currently being developed further inside the project. 
 
Title 19th Century United States Newspaper Advert Classifications 
Description A dataset of images drawn from the Library of Congress Newspaper Navigator Dataset (news-navigator.labs.loc.gov/). The dataset contains images and annotations used for training computer vision models to classify whether an advert is illustrated or not. This is a supplement to a forthcoming Programming Historian lesson (programminghistorian.org/) but can be used independently of this lesson. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? No  
Impact The dataset will be made public to coincide with the release of the Programming Historian Tutorials. 
 
Title 19th Century United States Newspaper Advert images with 'illustrated' or 'non illustrated' labels 
Description The dataset contains images derived from the Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/). "[The Newspaper Navigator dataset] consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project." (Source: https://news-navigator.labs.loc.gov/) One of these categories is 'advertisements'. This dataset contains a sample of these images with additional labels indicating if the advert is 'illustrated' or 'not illustrated'. The data is organised as follows: the images themselves can be found in `images.zip`; `newspaper-navigator-sample-metadata.csv` contains metadata about each image drawn from the Newspaper Navigator Dataset; `ads.csv` contains the labels for the images as a CSV file; `sample.csv` contains additional metadata about the images (based on the newspapers those images came from). This dataset was created for use in an under-review Programming Historian tutorial (http://programminghistorian.github.io/ph-submissions/lessons/computer-vision-deep-learning-pt1). The primary aim of the data was to provide a realistic example dataset for teaching computer vision for working with digitised heritage material. The data is shared here since it may be useful for others. This data documentation is a work in progress and will be updated when the Programming Historian tutorial is released publicly. The metadata CSV file contains the following columns: filepath, pub_date, page_seq_num, edition_seq_num, batch, lccn, box, score, ocr, place_of_publication, geographic_coverage, name, publisher, url, page_url, month, year, iiif_url. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://zenodo.org/record/4075210
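Since the description above lists the columns of the metadata CSV, a short sketch of reading such a file may help. It uses only Python's standard `csv` module and an in-memory sample instead of the real `newspaper-navigator-sample-metadata.csv`. The sample rows are invented, only a subset of the listed columns is shown, and the assumption that `year` holds a plain integer has not been checked against the published file. The function name `images_per_year` is hypothetical.

```python
import csv
import io
from collections import Counter

# Invented rows using a subset of the documented column names (hypothetical data).
SAMPLE_CSV = """filepath,pub_date,year,place_of_publication,name
img_0001.jpg,1890-01-04,1890,Example City,Example Gazette
img_0002.jpg,1890-02-11,1890,Example City,Example Gazette
img_0003.jpg,1905-07-23,1905,Sample Town,Sample Herald
"""


def images_per_year(csv_file):
    """Count rows per publication year in a metadata CSV file object."""
    reader = csv.DictReader(csv_file)
    # Assumes the 'year' column parses as an integer.
    return Counter(int(row["year"]) for row in reader)


counts = images_per_year(io.StringIO(SAMPLE_CSV))
for year, n in sorted(counts.items()):
    print(year, n)
```

To run this against the downloaded file, pass `open("newspaper-navigator-sample-metadata.csv", newline="", encoding="utf-8")` in place of the `io.StringIO` object.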
 
Title 19th Century United States Newspaper images predicted as Photographs with labels for "human", "animal", "human-structure" and "landscape" 
Description The dataset contains images derived from the Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/). "[The Newspaper Navigator dataset] consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project." (Source: https://news-navigator.labs.loc.gov/) One of these categories is 'photographs'. This dataset contains a sample of these images with additional labels indicating if the photograph has one or more of the following labels: "human", "animal", "human-structure" and "landscape". The data is organised as follows: the images themselves can be found in `images.zip`; `newspaper-navigator-sample-metadata.csv` contains metadata about each image drawn from the Newspaper Navigator Dataset; `multi_label.csv` contains the labels for the images as a CSV file; `annotations.csv` contains the labels for the images with additional metadata. This dataset was created for use in an under-review Programming Historian tutorial (http://programminghistorian.github.io/ph-submissions/lessons/computer-vision-deep-learning-pt2). The primary aim of the data was to provide a realistic example dataset for teaching computer vision for working with digitised heritage material. The data is shared here since it may be useful for others. This data documentation is a work in progress and will be updated when the Programming Historian tutorial is released publicly. The metadata CSV file contains the following columns: filepath, pub_date, page_seq_num, edition_seq_num, batch, lccn, box, score, ocr, place_of_publication, geographic_coverage, name, publisher, url, page_url, month, year, iiif_url. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://zenodo.org/record/4487140
 
Title Alston Herald, and East Cumberland Advertiser 
Description Alston Herald, and East Cumberland Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/31ee3711-05f9-4c94-847d-f922cc12cc36
 
Title Atherstone, Nuneaton, and Warwickshire Times 
Description Atherstone, Nuneaton, and Warwickshire Times was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/1efd2da4-0289-48cf-ad93-3e63139f22cd
 
Title Barrow Herald and Furness Advertiser 
Description Barrow Herald and Furness Advertiser (1863-1914) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/33b9bd5f-5ea0-4397-883b-cd04b91a7f39
 
Title Birkenhead News 
Description Birkenhead News was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/30830be0-b512-4609-8e3d-be5b7f2b1498
 
Title Blandford Weekly News 
Description Blandford Weekly News was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/7da92592-0a38-443e-8c3c-622284b57ace
 
Title Book and newspaper databases 
Description This database consists of ~49K books (metadata and full text, 4.5B words) and 11.8M newspaper pages (metadata only). We used the "Azure Database for PostgreSQL" service to manage this database. Various scripts and Jupyter notebooks were developed to access this database and perform exploratory data analysis. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact This database has been used in various text mining and natural language processing tasks, such as: 1) Generating language models including word2vec, fasttext, Flair and BERT type models. The book database was mainly used here as it has a large number of books suitable for training stable language models; however, we also trained several models using a sample from newspaper articles. 2) Pre-trained models used in "Assessing the Impact of OCR Quality on Downstream NLP Tasks" paper. 3) Developing the processing pipeline. 
 
Title Bridlington and Quay Gazette 
Description Bridlington and Quay Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/111e7722-a223-41af-af60-12b7bfeeb1d9
 
Title Bridport, Beaminster and Lyme Regis telegram 
Description Bridport, Beaminster and Lyme Regis telegram was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/a909ab42-2374-4517-aa2d-67310922e669
 
Title Brighouse & Rastrick Gazette 
Description Brighouse & Rastrick Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/a59056bf-46d6-427a-88fe-63dac27d0707
 
Title British Library Books genre detection model 
Description Model description: This model is intended to predict, from the title of a book, whether it is 'fiction' or 'non-fiction'. This model was trained on data created from the Digitised printed books (18th-19th Century) book collection. The datasets in this collection are comprised and derived from 49,455 digitised books (65,227 volumes), mainly from the 19th Century. This dataset is dominated by English language books and includes books in several other languages in much smaller numbers. This model was originally developed for use as part of the Living with Machines project to be able to 'segment' this large dataset of books into different categories based on a 'crude' classification of genre i.e. whether the title was `fiction` or `non-fiction`. The model's training data (discussed more below) primarily consists of 19th Century book titles from the British Library Digitised printed books (18th-19th century) collection. These books have been catalogued according to British Library cataloguing practices. The model is likely to perform worse on any book titles from earlier or later periods. The model is multilingual in that its training data includes non-English book titles, but these appear much less frequently. How to use: To use this model within fastai, first install version 2 of the fastai library, following the documentation instructions. Once you have fastai installed, you can use the model as follows:
from fastai.text.all import load_learner
learn = load_learner("20210928-model.pkl")
learn.predict("Oliver Twist")
Limitations and bias: The model was developed based on data from the British Library's Digitised printed books (18th-19th Century) collection. This dataset is not representative of books from the period covered, with biases towards certain types (travel) and a likely absence of books that were difficult to digitise. The formatting of the British Library books corpus titles may differ from other collections, resulting in worse performance on other collections. It is recommended to evaluate the performance of the model before applying it to your own data. Likely, this model won't perform well for contemporary book titles without further fine-tuning. Training data: The training data for this model will be available from the British Library Research Repository shortly. The training data was created using the Zooniverse platform. British Library cataloguers carried out the majority of the annotations used as training data. More information on the process of creating the training data will be available soon. Training procedure: Model training was carried out using the fastai library version 2.5.2. The notebook used for training the model will be available at: https://github.com/Living-with-machines/bl-books-genre-prediction Eval result: The model was evaluated on a held-out test set:
              precision  recall  f1-score  support
Fiction            0.91    0.88      0.90      296
Non-fiction        0.94    0.95      0.95      554
accuracy                             0.93      850
macro avg          0.93    0.92      0.92      850
weighted avg       0.93    0.93      0.93      850
 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://zenodo.org/record/5245174
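As a reading aid for the evaluation table in the record above, the sketch below recomputes the macro and weighted averages from the two per-class rows. The per-class numbers are copied from the table; the names `REPORT`, `macro_average` and `weighted_average` are hypothetical. Because the published per-class values are already rounded to two decimals, the recomputed averages can differ from the reported ones in the second decimal place.

```python
# Per-class rows copied from the evaluation table (already rounded to 2 d.p.).
REPORT = {
    "Fiction": {"precision": 0.91, "recall": 0.88, "f1": 0.90, "support": 296},
    "Non-fiction": {"precision": 0.94, "recall": 0.95, "f1": 0.95, "support": 554},
}


def macro_average(report, metric):
    """Unweighted mean of a metric across classes."""
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_average(report, metric):
    """Mean of a metric across classes, weighted by each class's support."""
    total = sum(row["support"] for row in report.values())
    return sum(row[metric] * row["support"] for row in report.values()) / total


total_support = sum(row["support"] for row in REPORT.values())
print("total support:", total_support)
for metric in ("precision", "recall", "f1"):
    print(
        metric,
        f"macro={macro_average(REPORT, metric):.3f}",
        f"weighted={weighted_average(REPORT, metric):.3f}",
    )
```

The weighted averages sit closer to the non-fiction scores because non-fiction titles make up roughly two thirds of the held-out test set.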
 
Title British Library Books genre detection model 
Description This model is intended to predict, from the title of a book, whether it is 'fiction' or 'non-fiction'. This model was trained on data created from the Digitised printed books (18th-19th Century) book collection. The datasets in this collection are comprised and derived from 49,455 digitised books (65,227 volumes), mainly from the 19th Century. This dataset is dominated by English language books and includes books in several other languages in much smaller numbers. This model was originally developed for use as part of the Living with Machines project to be able to 'segment' this large dataset of books into different categories based on a 'crude' classification of genre i.e. whether the title was `fiction` or `non-fiction`. 
Type Of Material Computer model/algorithm 
Year Produced 2021 
Provided To Others? Yes  
Impact Used as part of a forthcoming Living with Machines tutorial on genre classification 
URL https://doi.org/10.5281/zenodo.5245175
 
Title British Miner and General Newsman 
Description British Miner and General Newsman was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/cd67adaf-bbaa-498e-9b1f-4a3c71e4ca68
 
Title Central Glamorgan Gazette 
Description Central Glamorgan Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/bef4a88c-ed21-4848-af9a-13c7a4b911a7
 
Title Colne Valley Guardian 
Description Colne Valley Guardian was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/edcbd1fc-e739-4fa9-8a12-ff0f02cc1cb6
 
Title Cotton Factory Times 
Description Cotton Factory Times (1885-1889, 1891-1901) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/cf778baa-be01-4fe1-ae4d-4d3ef3fccf05
 
Title Cradley Heath & Stourbridge Observer 
Description Cradley Heath & Stourbridge Observer (1864 - 1888) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/63dc3c3d-bbeb-48cf-86e1-cb203f8f0bf8
 
Title Darlington & Richmond Herald 
Description Darlington & Richmond Herald was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/f88c6a06-cfff-43ac-9a69-49469b4e1ea7
 
Title Dataset for Toponym Resolution in Nineteenth-Century English Newspapers 
Description We present a new dataset for the task of toponym resolution in digitised historical newspapers in English. It consists of 343 annotated articles from newspapers based in four different locations in England (Manchester, Ashton-under-Lyne, Poole and Dorchester), published between 1780 and 1870. The articles have been manually annotated with mentions of places, which are linked---whenever possible---to their corresponding entry on Wikipedia. The dataset is published on the British Library shared research repository, and is especially of interest to researchers working on improving semantic access to historical newspaper content. We share the 343 annotated files (one file per article) in the WebAnno TSV file format version 3.2, a CoNLL-based file format. We additionally provide a TSV file with metadata at the article level, and the annotation guidelines. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact This dataset has already been used by researchers working on the task of named entity recognition in historical digitised newspapers. This dataset will be used in the HIPE 2022 shared task ("Identifying Historical People, Places and other Entities", https://hipe-eval.github.io/HIPE-2022/) organised by the Impresso project, on "Named Entity Recognition and Linking in Multilingual Historical Documents". The dataset will be used by teams from different institutions to develop and assess the performance of state-of-the-art methods in the tasks of named entity recognition and entity linking. This is the second edition of the shared task; 13 teams participated in the first edition. 
URL https://bl.iro.bl.uk/concern/datasets/de43a15c-e000-4fec-8b66-7ca94ae13db3
 
Title Decade-level Word2Vec models from automatically transcribed 19th-century newspapers digitised by the British Library (1800-1919) 
Description Word embeddings trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and the following parameters:
sg = True, min_count = 5, window = 5, vector_size = 100, epochs = 5
The embeddings are divided into periods of ten years each. Unlike those in this repository, these were not aligned and OCR errors skimmed from the vocabulary. See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData Project website (Living with Machines): https://livingwithmachines.ac.uk/ 
Type Of Material Database/Collection of data 
Year Produced 2023 
Provided To Others? Yes  
Impact The models are scalable and reusable, so that more research can be carried out with the same output. 
URL https://zenodo.org/record/7887305
 
Title Decade-level Word2Vec models from automatically transcribed 19th-century newspapers digitised by the British Library (1800-1919) 
Description Word embeddings trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and the following parameters:
sg = True, min_count = 5, window = 5, vector_size = 100, epochs = 5
The embeddings are divided into periods of ten years each. Unlike those in this repository, these were not aligned and OCR errors skimmed from the vocabulary. See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData Project website (Living with Machines): https://livingwithmachines.ac.uk/ 
Type Of Material Database/Collection of data 
Year Produced 2023 
Provided To Others? Yes  
URL https://zenodo.org/record/7887304
 
Title Denton and Haughton Examiner 
Description Denton and Haughton Examiner was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. Variant titles: 1873-74 The Denton, Haughton, & District Weekly News; 1874-75 Denton & Haughton Weekly News, and Audenshaw, Hooley Hill, and Dukinfield Advertiser; 1875-78 Denton Examiner, Audenshaw, Hooley Hill and Dukinfield Advertiser; 1878-92 Denton and Haughton Examiner, etc. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/2a759cdb-6203-438d-8148-30b40bfe734c
 
Title Diachronic and diatopic word embeddings from newspapers digitised by the British Library (1830-1889): North and South England 
Description Diachronic word embeddings (decade-level) trained with Word2Vec (via Gensim) on different geographic subcorpora of the Heritage Made Digital British and the Living with Machines historical newspaper collections: - North England (north.zip) - South England (south.zip) At the moment, for each subcorpus, Word2Vec models are available for each decade in the period 1830-1889. More models are on the way for the following: - each decade in the periods 1780-1829 and 1890-1920 for both North and South England. - diachronic models for the following regions: Scotland, Wales, and Midlands. The models were trained using the following parameters:
sg = True, min_count = 1, window = 5, vector_size = 200, epochs = 5
Like the embeddings in this repository, the model for each decade was aligned to the most recent one with Orthogonal Procrustes. See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData. Project website (Living with Machines): https://livingwithmachines.ac.uk/ Data related to: Nilo Pedrazzini & Barbara McGillivray, Diachronic and diatopic word embeddings from British historical newspapers, presented at AIUCD (Convegno dell'Associazione per l'Informatica Umanistica e la Cultura Digitale) in Siena (Italy), June 2023. 
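To illustrate how decade-level models with the parameters listed above might be trained, here is a minimal sketch using Gensim's `Word2Vec` class (Gensim 4.x argument names). The helper names `decade_of`, `group_by_decade` and `train_decade_models`, and the `(year, tokens)` input shape, are assumptions made for illustration; the project's actual pipeline is documented in the GitHub repository referenced above and may differ.

```python
# Hyperparameters as listed in the description above (sg=1 selects skip-gram).
W2V_PARAMS = {"sg": 1, "min_count": 1, "window": 5, "vector_size": 200, "epochs": 5}

def decade_of(year):
    """Return the first year of the decade containing `year` (1837 gives 1830)."""
    return year - year % 10

def group_by_decade(dated_sentences):
    """Group (year, tokens) pairs into {decade: [tokens, ...]}."""
    groups = {}
    for year, tokens in dated_sentences:
        groups.setdefault(decade_of(year), []).append(tokens)
    return groups

def train_decade_models(dated_sentences):
    """Train one Word2Vec model per decade.

    Requires the third-party gensim package; it is imported here so the
    grouping helpers above remain usable without it.
    """
    from gensim.models import Word2Vec
    return {
        decade: Word2Vec(sentences=sentences, **W2V_PARAMS)
        for decade, sentences in group_by_decade(dated_sentences).items()
    }

# Hypothetical toy input: (publication year, tokenised sentence).
toy_corpus = [
    (1831, ["the", "engine", "was", "started"]),
    (1839, ["a", "new", "machine", "arrived"]),
    (1840, ["the", "railway", "opened"]),
]
print(sorted(group_by_decade(toy_corpus)))
```

Keeping the decade grouping separate from training makes it straightforward to apply the same split to regional subcorpora such as the North and South England sets.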
Type Of Material Database/Collection of data 
Year Produced 2023 
Provided To Others? Yes  
Impact The models are scalable and reusable, so that more research can be carried out with the same output. 
URL https://zenodo.org/record/7892460
 
Title Diachronic and diatopic word embeddings from newspapers digitised by the British Library (1830-1889): North and South England 
Description Diachronic word embeddings (decade-level) trained with Word2Vec (via Gensim) on different geographic subcorpora of the Heritage Made Digital British and the Living with Machines historical newspaper collections: - North England (north.zip) - South England (south.zip) At the moment, for each subcorpus, Word2Vec models are available for each decade in the period 1830-1889. More models are on the way for the following: - each decade in the periods 1780-1829 and 1890-1920 for both North and South England. - diachronic models for the following regions: Scotland, Wales, and Midlands. The models were trained using the following parameters:
sg = True, min_count = 1, window = 5, vector_size = 200, epochs = 5
Like the embeddings in this repository, the model for each decade was aligned to the most recent one with Orthogonal Procrustes. See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData. Project website (Living with Machines): https://livingwithmachines.ac.uk/ Data related to: Nilo Pedrazzini & Barbara McGillivray, Diachronic and diatopic word embeddings from British historical newspapers, presented at AIUCD (Convegno dell'Associazione per l'Informatica Umanistica e la Cultura Digitale) in Siena (Italy), June 2023. 
Type Of Material Database/Collection of data 
Year Produced 2023 
Provided To Others? Yes  
URL https://zenodo.org/record/7892459
 
Title Diachronic word embeddings from 19th-century British newspapers 
Description Word vectors related to the paper Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers by Nilo Pedrazzini and Barbara McGillivray (2022). The embeddings were trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and the following parameters:
sg = True, min_count = 1, window = 3, vector_size = 200, epochs = 5
The embeddings are divided into periods of ten years each, with the vectors from each decade aligned to the ones from the most recent decade (1910s) using Orthogonal Procrustes. See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData 
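For readers unfamiliar with the alignment step, the standard Orthogonal Procrustes problem can be stated as follows. The notation is an assumption introduced here: $X$ holds the vectors of the shared vocabulary for an earlier decade and $Y$ the vectors of the same words for the reference decade (the 1910s), one word per row. How the vocabulary is intersected and normalised in the project's own code is documented in the repository above.

```latex
% Find the orthogonal map R that best carries the earlier space onto the reference space.
R^{*} \;=\; \arg\min_{R \,:\, R^{\top} R = I} \; \lVert X R - Y \rVert_{F}

% Closed-form solution via the singular value decomposition of X^T Y:
X^{\top} Y \;=\; U \Sigma V^{\top}
\quad\Longrightarrow\quad
R^{*} \;=\; U V^{\top}

% The aligned vectors for the earlier decade are then X R^{*}.
```

Because $R^{*}$ is orthogonal, the alignment preserves distances and cosine similarities within each decade while making vectors comparable across decades.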
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://zenodo.org/record/7181682
 
Title Diachronic word embeddings from 19th-century newspapers digitised by the British Library (1800-1919) 
Description Word vectors related to the paper Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers by Nilo Pedrazzini and Barbara McGillivray (2022). The embeddings were trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and the following parameters:
sg = True, min_count = 1, window = 3, vector_size = 200, epochs = 5
The embeddings are divided into periods of ten years each, with the vectors from each decade aligned to the ones from the most recent decade (1910s) using Orthogonal Procrustes. See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData Project webpage (Living with Machines): https://livingwithmachines.ac.uk/ 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://zenodo.org/record/7181681
 
Title Digitised historical newspapers 
Description Newspapers digitised by the British Library for the LwM project, with OCR processing performed by FindMyPast and supplied in a format consistent with the BNA. The dataset comprises ~630 GB of digitised text in METS/ALTO XML format and 435,642 JP2 image files (~6 TB) for 94 newspaper titles. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? No  
Impact Analysis of British historical newspaper content at scale. 
 
Title Dorset County Express and Agricultural Gazette 
Description Dorset County Express and Agricultural Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/7048dfa0-aea5-410f-a6dd-063b74a2c955
 
Title Example computer vision classification training data derived from British Library 19th Century Books Image collection 
Description Example computer vision classification training data derived from British Library 19th Century Books Image collection This dataset provides training data for image classification for use in a computer vision workshop. The images are derived from 'Digitised Books - Images identified as Embellishments. c. 1510 - c. 1900. JPG' from the year '1839'. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
Impact 85 Downloads of the dataset 
URL https://zenodo.org/record/3689444
 
Title Example computer vision classification training data derived from British Library 19th Century Books Image collection 
Description Example computer vision classification training data derived from British Library 19th Century Books Image collection This dataset provides training data for image classification for use in a computer vision workshop. The images are derived from 'Digitised Books - Images identified as Embellishments. c. 1510 - c. 1900. JPG' from the year '1839'. Currently, included are four folders containing a variety of images derived from the BL books corpus. 'cv_workshop_exercise_data' include images of: 'building', 'people', 'coat of arms' 'humancats' contains images of humans and images of cats The 'fashion' and 'portraits' folders both contain images of people organised into 'female' and 'male'. These labels were annotated by a single annotator and these categories may themselves not be meaningful. They are included in the workshop data as a point of discussion about how we should label data both in general and when working with historical data. This data is intended primarily as an educational resource. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://zenodo.org/record/3667575
 
Title Forest of Dean Examiner 
Description Forest of Dean Examiner (1873-1877) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/bde51795-fcf2-4a78-b232-47bae8b952c4
 
Title Frederick May's London Press Dictionary and Advertiser's Handbook (1883-1911) 
Description Newspaper directories produced and published annually in 19th-century Britain by advertising agent Frederick May and successors, containing information on newspapers, magazines and periodicals and arranged in alphabetical and sometimes tabular order. Information for each title included price, publisher, office, political and religious leaning. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/536678c9-4c26-41d2-bcbc-5b209ab393b4
 
Title Glasgow Chronicle 
Description Glasgow Chronicle was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/f63cbf7f-0380-4520-86a5-671b555cb274
 
Title Glasgow Courier 
Description Glasgow Courier was a thrice weekly/bi-weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/e4a924fa-608f-443f-88f0-d8b42009c88f
 
Title Halifax Local Opinion 
Description The Halifax Local Opinion was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. This dataset (BLNewspapers_HalifaxLocalOpinion0003063_1892.zip) is currently unavailable due to a technical glitch when uploading larger files into the repository. Hopefully this will be resolved and the dataset will be available by the end of March 2023. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/65f3bbae-e3d4-419b-bf94-c14b57a691c0
 
Title Images from Newspaper Navigator predicted as maps, with human corrected labels 
Description The Dataset contains images derived from the Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/). The Newspaper Navigator dataset consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project. source: https://news-navigator.labs.loc.gov/ One of these categories is 'maps'. In the original training data for Newspaper Navigator, there were relatively few labelled examples of maps. The predictions for maps have an Average Precision of 69.5%, and 34 images in the validation data. This dataset contains a sample of these images which have been predicted as 'maps'. It also includes additional labels which indicate whether the predicted map image is a 'map' or 'not a map'. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact Used as data for an example notebook showing how to train computer vision models. 59 downloads of the dataset (5/11/2020) 
URL https://zenodo.org/record/4156510
 
Title Images from Newspaper Navigator predicted as maps, with human corrected labels 
Description The Dataset contains images derived from the Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/). [The Newspaper Navigator dataset] consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project. source: https://news-navigator.labs.loc.gov/ One of these categories is 'maps'. In the original training data for Newspaper Navigator, there were relatively few labelled examples of maps. The predictions for maps have an Average Precision of 69.5%, and 34 images in the validation data. This dataset contains a sample of these images which have been predicted as 'maps'. It also includes additional labels which indicate whether the predicted map image is a 'map' or 'not a map'. The data is organised as follows: The images themselves can be found in 'newspaper_maps.zip' `2020_30_10_13_19_228_sample.json` contains metadata about each image drawn from the Newspaper Navigator Dataset. map_labels.csv contains the labels for the images as a CSV file 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://zenodo.org/record/4156509
 
Title Images from Newspaper Navigator predicted as maps, with human corrected labels 
Description The Dataset contains images derived from the Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/). [The Newspaper Navigator dataset] consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project. source: https://news-navigator.labs.loc.gov/ One of these categories is 'maps'. In the original training data for Newspaper Navigator, there were relatively few labelled examples of maps. The predictions for maps have an Average Precision of 69.5%, and 34 images in the validation data. This dataset contains a sample of these images which have been predicted as 'maps'. It also includes additional labels which indicate whether the predicted map image is a 'map' or 'not a map'. The data is organised as follows: The images themselves can be found in 'newspaper_maps.zip' `2020_30_10_13_19_228_sample.json` contains metadata about each image drawn from the Newspaper Navigator Dataset. map_labels.csv contains the labels for the images as a CSV file 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://zenodo.org/record/4156510
 
Title Kasra Hosseini, language model zoo 
Description Collection of trained word embeddings and language models, mainly by using the book database. Various model types are trained and added to the collection, e.g., word2vec, fasttext, contextual string embeddings (Flair), BERT. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact Language models and word-embeddings are one of the main ingredients in many NLP-related tasks in this project. Here, we keep track of the trained models, so researchers can easily find the models and use them for their research. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/blob/master/language_models/noteb...
 
Title Kenilworth Advertiser 
Description Kenilworth Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/5c6af14a-5dba-4f60-b8a3-c919ce0e5ef6
 
Title Lancaster Herald and Town and County Advertiser 
Description Lancaster Herald and Town and County Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/da8cebfc-e531-443b-b1fe-b4b39d18c302
 
Title Lancaster Standard and County Advertiser 
Description Lancaster Standard and County Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/f3020250-fc33-4405-b28c-b0194a31e049
 
Title Liverpool Weekly Courier 
Description Liverpool Weekly Courier was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/b8d6e83d-57f8-4ca0-ac9c-5cf88cea48c3
 
Title Living Machines atypical animacy dataset 
Description Atypical animacy detection dataset, based on nineteenth-century sentences in English extracted from an open dataset of nineteenth-century books digitized by the British Library (available via https://doi.org/10.21250/db14, British Library Labs, 2014). This dataset contains 598 sentences containing mentions of machines. Each sentence has been annotated according to the animacy and humanness of the machine in the sentence. This dataset has been created as part of the following paper: Ardanuy, M. C., F. Nanni, K. Beelen, Kasra Hosseini, Ruth Ahnert, J. Lawrence, Katherine McDonough, Giorgia Tolfo, D. C. Wilson and B. McGillivray. "Living Machines: A study of atypical animacy." In Proceedings of the 28th International Conference on Computational Linguistics (COLING2020). 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/work/323177af-6081-4e93-8aaf-7932ca4a390a
 
Title Living with Machines Zooniverse Participant Survey 
Description Summary results from a survey of contributors to Living with Machines Zooniverse crowdsourcing projects. Responses were received between 24 May and 13 June 2022. We designed the survey so that we could align our reporting with two other audience / participant research groups. Firstly, we used the demographic categories that the British Library use in other reporting, allowing us to see Zooniverse volunteers alongside other groups using the British Library's collections. Secondly, we aligned questions about motivations and barriers to participation with the CS Track citizen science research project survey. Our thanks to colleagues on the CS Track https://cstrack.eu/ project for permission to use options from their survey: Lampi, Emilia; Paajanen, Samu; Lämsä, Joni; Hämäläinen, Raija; Hästbacka, Heli; Sabel, Ohto. CSTrack Survey Data 2021. V. 12.8.2021. 10.17011/jyx/dataset/79371 https://jyx.jyu.fi/handle/123456789/79371 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/cb1ce859-a35d-46bd-9362-02c005fb66eb
 
Title Living with Machines alpha and beta Zooniverse 'accident' task data 
Description Data created through crowdsourcing tasks hosted on the Zooniverse platform. Members of the public were asked to look at a selection of articles from 19th century newspapers that mentioned machines and decide if they described an industrial accident. A further task asked participants to transcribe personal, organisational and place names mentioned, and add a brief summary of relevant accidents. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/work/4d262a8a-255b-45a1-a0fe-dc4af48e9798
 
Title Living with Machines alpha and beta Zooniverse 'accident' task data 
Description Data created through crowdsourcing tasks hosted on the Zooniverse platform. Members of the public were asked to look at a selection of articles from 19th century newspapers that mentioned machines and decide if they described an industrial accident. A further task asked participants to transcribe personal, organisational and place names mentioned, and add a brief summary of relevant accidents. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact Publishing the data is part of our contract with crowdsourcing participants, and provides evidence of our commitment to transparency and data sharing. 
URL https://doi.org/10.23636/1197
 
Title MapReader_Data_SIGSPATIAL_2022 
Description MapReader in GeoHumanities workshop (SIGSPATIAL 2022): Gold standards and outputs Refer to: https://github.com/Living-with-machines/MapReader/wiki/GeoHumanities-workshop-in-SIGSPATIAL-2022 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://zenodo.org/record/7116800
 
Title MapReader_Data_SIGSPATIAL_2022 
Description MapReader in GeoHumanities workshop (SIGSPATIAL 2022): Gold standards and outputs Refer to: https://github.com/Living-with-machines/MapReader/wiki/GeoHumanities-workshop-in-SIGSPATIAL-2022 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
Impact Republished on National Library of Scotland Data Foundry website: https://data.nls.uk/data/map-spatial-data/living-with-machines-railspace-building/ 
URL https://zenodo.org/record/7147906
 
Title Mariona Coll-Ardanuy - Creation of toponym resolution datasets (ongoing). 
Description Creation of toponym resolution datasets: ~1000 newspaper articles manually annotated with mentions of places and their geographical coordinates. The annotations are not yet complete. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? No  
Impact Ongoing work. We aim at publishing the dataset as soon as the annotations are complete. They will serve to assess the performance of our toponym resolution method and will be a contribution to several fields, like geographic information retrieval, computational linguistics, and digital humanities. 
 
Title Mariona Coll-Ardanuy, Creation of a gazetteer for toponym resolution (ongoing). 
Description Creation of a gazetteer for toponym resolution (alpha version). This is a Wikipedia-based gazetteer, enriched with data from the geographical database Geonames. The alpha version of the code that creates the gazetteer has already been released (see URL below). This work is ongoing: we are working on enriching it with data from historical sources (maps and text). 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact The gazetteer has not been made available, but publication and the code repository with the instructions on how to create it are publicly available. 
URL https://github.com/alan-turing-institute/lwm_GIR19_resolving_places
 
Title May's British and Irish Press Guide and Advertiser's Handbook & Dictionary etc. (1871-1880) 
Description Newspaper directories produced and published annually in 19th-century Britain by advertising agent Frederick May and successors, containing information on newspapers, magazines and periodicals and arranged in alphabetical and sometimes tabular order. Information for each title included price, publisher, office, political and religious leaning. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/0a4f3f09-11ff-4360-a73e-ce3a7654f14c
 
Title Midland Examiner and Wolverhampton Times 
Description Midland Examiner and Wolverhampton Times (1874-1878) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/d4cf88ee-0df9-4b38-9f40-9ec1d04e7e56
 
Title Neural Language Models for Nineteenth-Century English 
Description We present four types of neural language models trained on a large historical dataset of books in English, published between 1760 and 1900, and comprised of ~5.1 billion tokens. The language model architectures include word type embeddings (word2vec and fastText) and contextualized models (BERT and Flair). For each architecture, we trained a model instance using the whole dataset. Additionally, we trained separate instances on text published before 1850 for the type embeddings, and four instances considering different time slices for BERT. Our models have already been used in various downstream tasks where they consistently improved performance. 
Type Of Material Computer model/algorithm 
Year Produced 2021 
Provided To Others? Yes  
Impact Even though word2vec has been around for almost a decade (an eternity in the fast-moving NLP ecosystem), the word type embeddings it produces persist as popular instruments, especially for interdisciplinary research (Azarbonyad et al. 2017; Hengchen, Ros, & Marjanen, 2019). The more recent fastText model extends word2vec by using subword information. Contextualized language models have meant a breakthrough in NLP research (see e.g. Smith (2019) for an overview), as they represent words in the contexts in which they appear, instead of conflating all senses, one of the main criticisms of word type embeddings. The potential of using such models for historical research is immense as they allow a more accurate context-dependent representation of meaning. These embeddings can also be used in existing tools for historical research (e.g. Hosseini, Nanni, and Coll Ardanuy (2020)). Given that existing libraries, such as Gensim, Flair, or Hugging Face, provide convenient interfaces to work with these embeddings, we are confident that our historical models will serve the needs of a wide variety of scholars, from NLP and data science to the humanities, for different tasks and research purposes, such as measuring how words change meaning over time (Kulkarni, Al-Rfou, Perozzi, & Skiena, 2015; Tahmasebi, Borin, & Jatowt, 2018), automatic OCR correction (Hämäläinen & Hengchen, 2019), interactive query expansion or, more generally, any research that involves diachronic language change. 
URL https://zenodo.org/record/4782245
 
Title Neural Language Models for Nineteenth-Century English (dataset; language model zoo) 
Description This dataset contains four types of neural language models trained on a large historical dataset of books in English, published between 1760 and 1900 and comprised of ~5.1 billion tokens. The language model architectures include static (word2vec and fastText) and contextualized models (BERT and Flair). For each architecture, we trained a model instance using the whole dataset. Additionally, we trained separate instances on text published before 1850 for the two static models, and four instances considering different time slices for BERT. GitHub repository: https://github.com/Living-with-machines/histLM 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://zenodo.org/record/4779090
 
Title Newspaper Directories digitised, OCRed, modelled and structured data extracted from Mitchell's directories (1846-1909) 
Description This collection includes a subset of Mitchell's Newspaper Press Directories which is annotated and structured for future incorporation in the Newspaper database. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact The information extracted from the Press Directories will significantly contribute to enriching newspaper data received from Heritage Made Digital, FindMyPast and JISC. It will also contribute to the environmental scan project and paper. 
 
Title North Cumberland Reformer 
Description North Cumberland Reformer (1890 - 1898) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/1badb649-ad58-416f-a403-13228780c964
 
Title Northern Guardian (Hartlepool) 
Description Northern Guardian (Hartlepool) (1891 - 1902) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/62fccfe5-7ed8-46aa-acd1-47a6b69dd7fb
 
Title Northern Weekly Gazette 
Description Northern Weekly Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://nms.iro.bl.uk/concern/datasets/73abb348-9b50-429e-89a2-d304c1fbcee6
 
Title Nuneaton Times 
Description Nuneaton Times was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/c0b347e8-1075-4c90-8ab2-b21cab338ef5
 
Title Ordnance Survey Old / First series England and Wales 1:63360 (georeferenced sheet images) 
Description Map sheet images for the Ordnance Survey Old Series / First Series England and Wales 1:63360, georeferenced and cropped at the neatline (can be viewed together as a seamless composite). GeoTIFF format. The original (ungeoreferenced) sheet images can be found at: https://commons.wikimedia.org/wiki/Category:Ordnance_Survey_Old/First_series_England_and_Wales_1:63360_(full_sheets). The sheets were georeferenced by relating the sheet corners to their coordinates (no internal control points applied). 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact
URL https://bl.iro.bl.uk/concern/datasets/2fa13eb5-1767-469b-b4c0-d9d518bfc1b3#?c=0&m=0&s=0&cv=0&xywh=0%...
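As a rough illustration of corner-based georeferencing of the kind described for this dataset, the sketch below maps a pixel position to map coordinates by bilinear interpolation between four sheet corners. The function name, the corner layout and the sample coordinates are assumptions made for illustration; this is not the code used to produce the dataset.

```python
def pixel_to_map(px, py, width, height, corners):
    """Bilinear interpolation between four georeferenced sheet corners.

    corners: dict with keys 'tl', 'tr', 'bl', 'br', each an (x, y) pair in
    map coordinates (a hypothetical layout chosen for this sketch).
    """
    u = px / width    # 0.0 at the left edge, 1.0 at the right edge
    v = py / height   # 0.0 at the top edge, 1.0 at the bottom edge

    def lerp(a, b, t):
        # Linear interpolation between two (x, y) points.
        return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)

    top = lerp(corners["tl"], corners["tr"], u)
    bottom = lerp(corners["bl"], corners["br"], u)
    return lerp(top, bottom, v)


# Made-up corner coordinates for a 1000 x 800 pixel sheet image.
corners = {"tl": (0.0, 80.0), "tr": (100.0, 80.0),
           "bl": (0.0, 0.0), "br": (100.0, 0.0)}
centre = pixel_to_map(500, 400, 1000, 800, corners)
```

Because only the corners are used, any distortion inside the sheet is left uncorrected, which matches the "no internal control points" caveat in the description.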
 
Title Pontypridd District Herald 
Description Pontypridd District Herald was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/9bc902fc-d174-4149-8395-bbdee46e4309
 
Title Poole Telegram 
Description Poole Telegram was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/cd062f62-e184-4013-bcba-40e959eba4ac
 
Title Potteries Examiner 
Description Potteries Examiner (1871 - 1881) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/2b1174ea-5c32-4acb-a596-254a16f7b54e
 
Title Shropshire Examiner 
Description Shropshire Examiner (1874-1877) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://kew.iro.bl.uk/concern/datasets/a52d650d-4a54-40b0-b580-cb19ac5aa744
 
Title South Staffordshire Examiner 
Description South Staffordshire Examiner (1874) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/c500f18c-0d24-4cb4-9c3e-1865d5d89e04
 
Title St. Helens Examiner 
Description St. Helens Examiner was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/01932735-62a4-4846-94e1-496d32838f8f
 
Title Stalybridge Examiner 
Description Stalybridge Examiner (1876) has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/8b771dc5-4b6b-44f5-8bda-7ed59a5f875d
 
Title Stockton Herald, South Durham and Cleveland Advertiser 
Description Stockton Herald, South Durham and Cleveland Advertiser (1858 - 1918) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/c399bc41-c5f6-45c2-b992-38c9d9553cad
 
Title StopsGB: Structured Timeline of Passenger Stations in Great Britain 
Description Michael Quick's book _Railway Passenger Stations in Great Britain: a Chronology_ offers a uniquely rich and detailed account of Britain's changing railway infrastructure. Its listing of over 12,000 stations allows us to reconstruct the coming of rail at both micro- and macro-scales. However, being published originally as a book (and subsequently online as a PDF created from an underlying MS Word document), this resource was not well suited for systematic linking to other data. We now present a new, automatically generated dataset that provides the rich detail of this exceptional resource in a structured format. Each station described in the _Chronology_ is given certain attributes, such as operating companies and opening and closing dates, and is georeferenced and linked, whenever possible, to its corresponding entry on Wikidata. We name this structured, linked, and georeferenced dataset 'StopsGB' (Structured Timeline of Passenger Stations in Great Britain), and we make it openly available. We believe this dataset (and the method used to create it) will be of widespread interest across the historical, digital library and semantic web communities, and that it will be a key resource for ongoing research into the impact of the railway in Great Britain. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact This is a new contribution. We expect that this dataset (and the method used to create it) will be of widespread interest across the historical, digital library and semantic web communities, and that it will be a key resource for ongoing research into the impact of the railway in Great Britain. 
URL https://bl.iro.bl.uk/concern/datasets/0abea1b1-2a43-4422-ba84-39b354c8bb09
 
Title Stretford and Urmston Examiner 
Description Stretford and Urmston Examiner (1879 - 1880) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/93ecb6b9-f982-4ee6-846a-f2dfbd3a5ce6
 
Title Supplementary material for 'A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching' 
Description Supplementary material for the https://github.com/Living-with-machines/LwM_SIGSPATIAL2020_ToponymMatching repository, containing the underlying code and materials for the paper 'A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching', accepted to SIGSPATIAL2020 as a poster paper. Coll Ardanuy, M., Hosseini, K., McDonough, K., Krause, A., van Strien, D. and Nanni, F. (2020): A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching, SIGSPATIAL: Poster Paper. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://zenodo.org/record/4034819
 
Title Supplementary material for 'A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching' 
Description Supplementary material for the https://github.com/Living-with-machines/LwM_SIGSPATIAL2020_ToponymMatching repository, containing the underlying code and materials for the paper 'A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching', accepted to SIGSPATIAL2020 as a poster paper. Coll Ardanuy, M., Hosseini, K., McDonough, K., Krause, A., van Strien, D. and Nanni, F. (2020): A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching, SIGSPATIAL: Poster Paper. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL https://zenodo.org/record/4034818
 
Title Supplementary material for 'Station to Station: Linking and Enriching Historical British Railway Data' 
Description Supplementary material for the station-to-station GitHub repository, containing the underlying code and materials for the paper 'Station to Station: Linking and Enriching Historical British Railway Data', accepted to CHR2021 (Computational Humanities Research). Mariona Coll Ardanuy, Kaspar Beelen, Jon Lawrence, Katherine McDonough, Federico Nanni, Joshua Rhodes, Giorgia Tolfo, and Daniel C.S. Wilson. "Station to Station: Linking and Enriching Historical British Railway Data." In Computational Humanities Research (CHR2021). 2021. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://zenodo.org/record/5520882
 
Title Swansea Journal and South Wales Liberal 
Description Swansea Journal and South Wales Liberal was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/b5e6823b-0b68-4b2b-a1a6-4d3e558cb1eb
 
Title Swansea and Glamorgan Herald, and South Wales Free Press 
Description Swansea and Glamorgan Herald, and South Wales Free Press (1847 - 1890) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/76b247db-9bb0-44ea-90bd-50770def196a
 
Title Tamworth Miners' Examiner and Working Men's Journal 
Description Tamworth Miners' Examiner and Working Men's Journal (1873 - 1876) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/a71d3774-e548-4a36-9a55-b4e22e0d6761
 
Title The Blackpool Gazette & Herald 
Description The Blackpool Gazette & Herald (1874 - 1919) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. All but one of these datasets is currently unavailable due to a technical glitch when uploading larger files into the repository. Hopefully this will be resolved and all the datasets will be available by the end of March 2023. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/e385dfcb-0310-44f5-b84a-c714e6464324
 
Title The Cannock Chase Examiner 
Description The Cannock Chase Examiner (1874-1877) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/e5c83b89-dc98-4b08-84a8-ddba4a842f8e
 
Title The Newspaper Press Directory (1846-1880) 
Description Newspaper directories produced and published annually in 19th-century Britain by advertising agent Charles Mitchell. Newspapers are listed primarily in alphabetical order of the town where the title was published. Information for each title included: features connected with the district such as population and trade; principal towns in district; title, price, day of publication; politics; date of first issue; political leanings and special interests; proprietors and publishers. Some information on overseas titles is also included in selected years. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/020c22c4-d1ee-4fca-bf75-0420fe59347a
 
Title The Newspaper Press Directory (1846-1920) - enriched and structured version 
Description Mitchell's Newspaper Press Directories contained an almost complete list of newspapers published in England, Wales, Scotland and Ireland. They were published regularly from 1846 onwards and provided a detailed description of the newspaper landscape over time. This version contains a structured, tabular representation of the directories (as CSV or Excel Spreadsheet). Each row describes a newspaper at a specific point in time. We record title, politics, price, location and other information. Please consult the data card for a detailed overview of the data structure and for more background on the digitisation process. 
Type Of Material Database/Collection of data 
Year Produced 2023 
Provided To Others? Yes  
Impact This data underpins previous and future research papers which apply the Environmental Scan method to historical newspaper collections. It was also used in many of the digital residencies. In general, it is a useful resource for media historians and those looking for information about the 19th-century press. 
URL https://bl.iro.bl.uk/concern/datasets/adcef12a-bb3d-40d9-871d-5784022a77e8
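Since each row of the structured version describes a newspaper at a specific point in time, a title can be followed across editions by grouping rows. The sketch below shows one way to do this with the standard library; the column names and the sample rows are invented for illustration and are not the published schema (the data card describes the real one).

```python
import csv
import io
from collections import defaultdict

# Made-up rows; the column names are assumptions, not the published schema.
SAMPLE = """title,year,politics,price
Example Gazette,1846,liberal,4d
Example Gazette,1857,liberal,2d
Sample Herald,1857,conservative,1d
"""


def history_by_title(csv_text):
    """Group rows by title so each newspaper can be followed over time."""
    history = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        history[row["title"]].append(
            (int(row["year"]), row["politics"], row["price"])
        )
    for entries in history.values():
        entries.sort()  # chronological order within each title
    return dict(history)


history = history_by_title(SAMPLE)
```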
 
Title The Newspaper Press Directory (1881-1920) 
Description Newspaper directories produced and published annually in 19th-century Britain by advertising agent Charles Mitchell. Newspapers are listed primarily in alphabetical order of the town where the title was published. Information for each title included: features connected with the district such as population and trade; principal towns in district; title, price, day of publication; politics; date of first issue; political leanings and special interests; proprietors and publishers. Some information on overseas titles is also included in selected years. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/e943d0e1-59dd-48b5-9a0b-dc7723b30749
 
Title The Runcorn Examiner 
Description The Runcorn Examiner (1870-1954) was a weekly newspaper and years 1870-1920 have been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/f4761b24-54b6-413c-8acb-92cef09866fb
 
Title The Stockton Examiner 
Description The Stockton Examiner (1878-1879) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/ce3731c3-a998-4054-a761-6c68e6c1a626
 
Title Warrington Examiner 
Description Warrington Examiner (1869-1901) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/2e4efac2-e341-467b-86e9-4269ec07c474
 
Title Warwickshire Herald 
Description Warwickshire Herald was a weekly newspaper which has been digitised by the British Library for the Living with Machines project 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/29a92211-d26e-49d6-9fd0-e6854768cb86
 
Title Weekly Journal 
Description The file consists of the OCR (Optical Character Recognition) text in XML format for one year of Weekly Journal (Hartlepool), 1901. The full digitised newspaper comprises no. 1-407 (29 Nov. 1901 - 17 Sep. 1909). The digitised page images are available on the British Newspaper Archive website, https://www.britishnewspaperarchive.co.uk/titles/weekly-journal-hartlepool The British Newspaper Archive page images are behind a paywall, but from March 2021 the paywall will be lifted and some of these images will be free to view. The newspaper continued beyond 1901 but this has not been included in this dataset due to copyright considerations. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/7401d41e-0f67-407d-8f2b-e3f8ba02b7f5
 
Title Weymouth Telegram 
Description Weymouth Telegram (1860 - 1901) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/85d1ab3b-2902-41ba-91db-4cde128e181a
 
Title Widnes Examiner 
Description Widnes Examiner (1876-1920) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://bl.iro.bl.uk/concern/datasets/027b1e1a-b8bb-4160-99f3-41bca7ba2377
 
Description Collaboration with the Estonian War Museum on a Europeana-funded project 
Organisation Europeana
Country Netherlands 
Sector Public 
PI Contribution I was invited to be a named researcher on a bid by the Estonian War Museum to run a workshop and pilot mini-crowdsourcing projects, funded by Europeana. I contributed to their survey design, and devised and ran 6 structured sessions within a 2 day workshop, designed to take organisations through the processes involved in planning a successful crowdsourcing project. The workshops included prompts for discussion across many departments and disciplines within an organisation, and concluded with group presentations of the ideas developed through the workshops.
Collaborator Contribution The project "Crowdsourcing for military heritage in Estonia" is funded with 9,925 euros for the period January to June 2022. The Estonian War Museum leads the project, organising the workshops, running the survey and reporting on the results, and monitoring the five projects as they develop to June 2022.
Impact The final outputs will be five small-scale crowdsourcing projects by Estonian museums, a survey, and publications on the lessons the institutions running them learned from the research project.
Start Year 2021
 
Description Digital Residency of Jennifer Hayward and team: Unlocking the Past: Structured Data Extraction in 19th Century Chilean Newspapers 
Organisation Adolfo Ibáñez University
Country Chile 
Sector Academic/University 
PI Contribution Provided funding, data and mentoring
Collaborator Contribution Delivered a project on digitised Chilean newspapers, outlined here: https://livingwithmachines.ac.uk/lwm-digital-residency-unlocking-the-past-structured-data-extraction-in-19th-century-chilean-newspapers/
Impact The project was delivered and will continue to inform future work. Formal report was submitted and will appear in due course on the BL Research Repository (https://bl.iro.bl.uk/)
Start Year 2023
 
Description Digital Residency of Joanne Shepard: Archiving The Railway UK (AR-UK) 
Organisation Durham University
Country United Kingdom 
Sector Academic/University 
PI Contribution We provided funding, data (StopsGB dataset), and mentoring for this digital residency.
Collaborator Contribution Shepard developed the Archiving The Railway UK (AR-UK) project, reported on here: https://livingwithmachines.ac.uk/lwm-digital-residency-archiving-the-railway-uk-ar-uk/
Impact The official project report will be available on the BL's research repository in due course. https://bl.iro.bl.uk/
Start Year 2023
 
Description Humphrey Southall (Vision of Britain) 
Organisation University of Southampton
Country United Kingdom 
Sector Academic/University 
PI Contribution Reuse of data and citation.
Collaborator Contribution Data sets shared in addition to those available for download on the Vision of Britain site, including a simplified data table.
Impact Data sharing.
Start Year 2019
 
Description Living with Machines and Find My Past 
Organisation Findmypast
Country United Kingdom 
Sector Private 
PI Contribution We will be sharing the methods and outcomes of our research on this data, for example OCR correction, and toponym resolution.
Collaborator Contribution FMP has shared newspaper data with Living with Machines for two counties (Lancashire and Dorset), and in the near future will be sharing all newspapers from Britain dating 1780-1920 that were digitised by FMP for the British Newspaper Archive. A member of FMP also sits on Living with Machines' Advisory Board.
Impact Findmypast has provided samples of the British Library's digitised Newspaper Collection and has advised us through its membership of the Living with Machines Advisory Board. There are prospects of working together on OCR correction following the ingestion of other incoming full data-sets from the same collection.
Start Year 2018
 
Description Living with Machines and National Library of Scotland 
Organisation National Library of Scotland
Country United Kingdom 
Sector Academic/University 
PI Contribution Living with Machines initiated contact with Chris Fleet, map curator at the NLS to investigate access to their digitized map collections. K. McDonough and O.Vane have worked closely with Fleet over the last 9 months to share and evaluate the digital map holdings. We organized a workshop (June 2019) at the Turing/BL with Chris and other historical maps experts to explore best practices in working with large collections for a digital humanities project. We have shared back reflections and code for enriching the collection metadata, visualizing the collections, and have also developed a close working relationship that will continue to grow (through the sharing of additional maps and metadata as well as collaborative research into other ways of sharing digital map data to researchers through IIIF).
Collaborator Contribution NLS Maps curator Chris Fleet has shared a subset of the 200,000 digitized sheets, to be expanded on in the near future. He has provided extensive advice and support for working with the metadata, accessing versions of the maps as web map tiles, and thinking about the next steps of using these materials in a computational research environment. He has also been immensely helpful in connecting Living with Machines to the small, but growing community of researchers using machine learning methods with maps.
Impact Blog posts (Computational Approaches to Ordnance Survey Maps: Finding words in maps, part 2: seeing the results blog post); Talks (Katie McDonough, Olivia Vane, and Daniel van Strien gave a '21st Century Talk' for British Library staff: 'Maps and Machines: Using Computer Vision to Analyze the Geography of Industrialization (1780-1920)', 14 Jan 2020; Daniel van Strien, Kaspar Beelen, CREATE Digital History Workshop: Maps-as-Data: Analysing Historical Maps with Computer Vision, Feb 2020; Katherine McDonough, "Living with Machines," presentation at DH Seminar, Center for Spatial and Textual Analysis, Stanford University, December 2 2019; Katherine McDonough, "Living with Machines," invited presentation at Spatial Relationships in Text as Data, The Alan Turing Institute, October 28, 2019; Katherine McDonough and Jon Lawrence, "An introduction to Living with Machines," University of Exeter DH Seminar, 23 October 2019); Workshops (Daniel van Strien, British Library Digital Scholarship Training Programme, workshop on computer vision for historical maps, 13 February 2020; and Katherine McDonough, Fantastic Futures, invited presentation and workshop on computer vision for historical maps, 4-5 December 2019); and Meetings (Katherine McDonough organized meetings with US experts in historical map processing using computer vision, 29/8/2019 and 1/11/2019).
Start Year 2019
 
Title Branching sparklines / line graphs 
Description This notebook demonstrates the branching design used in Press Picker: an interactive visualisation tool for newspaper metadata at the British Library, created in the Living with Machines project. Press Picker shows the holdings per year of different UK newspapers at the library, and their different formats. We used branching to communicate newspapers changing their name. Through history, newspapers sometimes change their name multiple times, particularly local papers. For example, The Athletic Reporter in 1886 becomes The Reporter, which in 1888 becomes The Midland Counties Reporter and General Advertiser, which in 1889 becomes The Reporter and General Advertiser, and so on. In the British Library data, a new name is treated as a wholly separate record. Introducing this branching means we bring together data that, to some extent, is referring to the same thing. 
Type Of Technology Webtool/Application 
Year Produced 2021 
Impact
URL https://observablehq.com/@oliviafvane/branching-sparklines-line-graphs
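The idea of bringing together records that refer to the same newspaper under successive names can be sketched as following rename links from one record to the next. The example below is a minimal Python illustration using the titles named in the description; the function name and the shape of the `renamed_to` mapping are assumptions, and the notebook itself is an Observable notebook rather than this code.

```python
def name_chain(first_title, renamed_to):
    """Follow rename links from a starting title to its latest known name.

    renamed_to maps an old title to (year_of_change, new_title).
    """
    chain = [(None, first_title)]
    seen = {first_title}
    current = first_title
    while current in renamed_to:
        year, new_title = renamed_to[current]
        if new_title in seen:  # guard against circular rename data
            break
        chain.append((year, new_title))
        seen.add(new_title)
        current = new_title
    return chain


# Rename links taken from the example in the description above.
renamed_to = {
    "The Athletic Reporter": (1886, "The Reporter"),
    "The Reporter": (
        1888, "The Midland Counties Reporter and General Advertiser"),
    "The Midland Counties Reporter and General Advertiser": (
        1889, "The Reporter and General Advertiser"),
}
chain = name_chain("The Athletic Reporter", renamed_to)
```

Each element of `chain` pairs the year of a name change with the new title, so holdings recorded under separate records can be drawn as branches of one line.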
 
Title DeezyMatch 
Description DeezyMatch: A Flexible Deep Neural Network Approach to Fuzzy String Matching. DeezyMatch can be applied to the following tasks: record linkage; candidate selection for entity linking systems; toponym matching. 
Type Of Technology Software 
Year Produced 2020 
Open Source License? Yes  
URL https://zenodo.org/record/3983554
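To make the candidate selection task concrete, the sketch below ranks gazetteer entries against a query string. It deliberately swaps in Python's standard-library `difflib.SequenceMatcher` as a simple character-level stand-in; it does not use DeezyMatch or its neural models, and the function name, the misspelt query and the short gazetteer list are assumptions for illustration.

```python
from difflib import SequenceMatcher


def rank_candidates(query, candidates, top_n=3):
    """Rank candidate strings by character-level similarity to the query."""
    scored = [
        (SequenceMatcher(None, query.lower(), cand.lower()).ratio(), cand)
        for cand in candidates
    ]
    # Highest similarity first; Python's sort is stable for ties.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cand for _, cand in scored[:top_n]]


# Hypothetical gazetteer and an invented spelling variant as the query.
gazetteer = ["Stalybridge", "Stockton", "Warrington", "Widnes"]
best = rank_candidates("Staleybridge", gazetteer, top_n=1)
```

A learned matcher is aimed at the variation that a fixed edit-based score handles poorly, such as OCR errors and historical spellings, which is the motivation for a neural approach.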
 
Title DeezyMatch 
Description DeezyMatch: A Flexible Deep Neural Network Approach to Fuzzy String Matching. DeezyMatch can be applied to the following tasks: record linkage; candidate selection for entity linking systems; toponym matching. 
Type Of Technology Software 
Year Produced 2020 
Open Source License? Yes  
URL https://zenodo.org/record/3983555
 
Title DiachronicEmb-BigHistData 
Description Pipeline to preprocess, train, and align diachronic word embeddings from Big Historical Data, and carry out semantic change tasks on them. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact
URL https://github.com/Living-with-machines/DiachronicEmb-BigHistData
 
Title Living-with-machines/alto2txt 
Description alto2txt: Extract plain text from newspapers. Converts XML (in METS 1.8/ALTO 1.4, METS 1.3/ALTO 1.4, BLN or UKP format) publications to plaintext articles and generates minimal metadata. Full documentation and demo instructions. Added: PyPI version and MIT license badges to README.md; pytest-cov with default options to assess documentation; isort in .pre-commit-config.yaml to sort import consistency; pycln in .pre-commit-config.yaml to check unused imports; pycln configuration in pyproject.toml; alto2txt as a command line script in pyproject.toml. Changed: switched from Apache v2.0 license to MIT license, in line with project recommendations; updated mypy in .pre-commit-config.yaml. Deprecated: replaced extract_publications_text.py with the alto2txt command line interface script specified in pyproject.toml. Removed: setup.py; requirements.txt. Fixed: python = ">3.6.0" in pyproject.toml rather than >3.7, for consistency with documentation; licensing ambiguity (now all should be MIT); typos in README.md; superfluous imports via pycln in pre-commit. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
URL https://zenodo.org/record/7378349
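The core idea of turning ALTO XML into plain text can be sketched with the standard library: ALTO stores each recognised word in the `CONTENT` attribute of a `String` element inside a `TextLine`. The example below is a minimal illustration on a tiny made-up fragment; it is not alto2txt's implementation, the function names are hypothetical, and the namespace in the sample is shown only as an example.

```python
import xml.etree.ElementTree as ET

# Tiny made-up ALTO-like fragment; real files carry far more layout detail.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine><String CONTENT="Living"/><SP/><String CONTENT="with"/></TextLine>
    <TextLine><String CONTENT="Machines"/></TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def alto_to_text(xml_text):
    """Join the CONTENT of String elements, one output line per TextLine."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) == "TextLine":
            words = [child.attrib.get("CONTENT", "")
                     for child in element
                     if local_name(child.tag) == "String"]
            lines.append(" ".join(words))
    return "\n".join(lines)


text = alto_to_text(SAMPLE_ALTO)
```

Matching on local tag names keeps the sketch independent of which ALTO version a file declares.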
 
Title Living-with-machines/alto2txt 
Description alto2txt: Extract plain text from newspapers. Converts XML (in METS 1.8/ALTO 1.4, METS 1.3/ALTO 1.4, BLN or UKP format) publications to plaintext articles and generates minimal metadata. Full documentation and demo instructions. Added: PyPI version and MIT license badges to README.md; pytest-cov with default options to assess documentation; isort in .pre-commit-config.yaml to sort import consistency; pycln in .pre-commit-config.yaml to check unused imports; pycln configuration in pyproject.toml; alto2txt as a command line script in pyproject.toml. Changed: switched from Apache v2.0 license to MIT license, in line with project recommendations; updated mypy in .pre-commit-config.yaml. Deprecated: replaced extract_publications_text.py with the alto2txt command line interface script specified in pyproject.toml. Removed: setup.py; requirements.txt. Fixed: python = ">3.6.0" in pyproject.toml rather than >3.7, for consistency with documentation; licensing ambiguity (now all should be MIT); typos in README.md; superfluous imports via pycln in pre-commit. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact Software supporting the digital research infrastructure for working with digitised records such as historic newspapers. Of use to GLAM sector institutions and researchers alike 
URL https://zenodo.org/record/7378350
 
Title Living-with-machines/hmd_newspaper_dl: Initial release 
Description This release is for a version of the code which works with the current version of the British Library Research Repository. What's Changed: update code to support new BL repo by @davanstrien in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/4 ; Bump addressable from 2.7.0 to 2.8.0 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/5 ; Bump rexml from 3.2.4 to 3.2.5 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/7 ; Bump nokogiri from 1.11.0 to 1.12.5 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/6 . New Contributors: @dependabot made their first contribution in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/5 . Full Changelog: https://github.com/Living-with-machines/hmd_newspaper_dl/compare/v0.0.1...v0.0.2 
Type Of Technology Software 
Year Produced 2021 
URL https://zenodo.org/record/5571790
 
Title Living-with-machines/hmd_newspaper_dl: Initial release 
Description This release is for a version of the code which works with the current version of the British Library Research Repository. What's Changed: update code to support new BL repo by @davanstrien in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/4 ; Bump addressable from 2.7.0 to 2.8.0 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/5 ; Bump rexml from 3.2.4 to 3.2.5 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/7 ; Bump nokogiri from 1.11.0 to 1.12.5 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/6 . New Contributors: @dependabot made their first contribution in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/5 . Full Changelog: https://github.com/Living-with-machines/hmd_newspaper_dl/compare/v0.0.1...v0.0.2 
Type Of Technology Software 
Year Produced 2021 
Impact Code for bulk downloading newspaper datasets 
URL https://zenodo.org/record/5571839
 
Title Living-with-machines/image-search: workshop materials 
Description What's Changed: workshop materials for hack and yack created by @davanstrien in https://github.com/Living-with-machines/image-search/pull/1
Type Of Technology Software 
Year Produced 2022 
URL https://zenodo.org/record/6473464
 
Title Living-with-machines/nnanno: 0.0.2 
Description This release adds nnanno to PyPI 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
URL https://zenodo.org/record/5537184
 
Title MapReader 
Description MapReader is a free, open-source software library written in Python for analyzing large map collections (scanned or born-digital). This library transforms the way historians can use maps by turning extensive, homogeneous map sets into searchable primary sources. MapReader allows users with little or no computer vision expertise to i) retrieve maps via web-servers; ii) preprocess and divide them into patches; iii) annotate patches; iv) train, fine-tune, and evaluate deep neural network models; and v) create structured data about map content. 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact Further applications in a new domain: the Turing project Scivision is applying MapReader in a plant phenotyping task. 
URL https://github.com/Living-with-machines/MapReader
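
The "divide them into patches" step in the description can be illustrated with a minimal sketch. This is not MapReader's API: the function name `patch_boxes` and its parameters are hypothetical, and the sketch only computes pixel bounding boxes for a regular grid over an image of a given size.

```python
from typing import Iterator, Tuple

# (left, top, right, bottom) in pixels
Box = Tuple[int, int, int, int]


def patch_boxes(width: int, height: int, patch_size: int) -> Iterator[Box]:
    """Yield bounding boxes that tile a width x height image.

    Hypothetical helper, not part of MapReader. Patches on the right
    and bottom edges are clipped to the image bounds.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            right = min(left + patch_size, width)
            bottom = min(top + patch_size, height)
            yield (left, top, right, bottom)


# A 250 x 120 pixel image cut into 100-pixel patches: 3 columns x 2 rows.
boxes = list(patch_boxes(width=250, height=120, patch_size=100))
print(len(boxes))
print(boxes[-1])
```

Each box could then be passed to an image library's crop function to produce the patch that is annotated or classified.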
 
Title Notebook: Prepare Zooniverse Data for Analysis and Deposit 
Description This Jupyter Notebook, written in Python, combines Zooniverse classification and subject files into a single CSV with redacted usernames and identifying information. It can be opened directly in Colab from the page. It is a November 2023 update to the original tutorial to support a release of Zooniverse data. Part of a collection of Jupyter Notebooks for processing Zooniverse classification and subject files created for the British Library's Digital Scholarship Training Programme by the Living with Machines project's British Library team. 
Type Of Technology Software 
Year Produced 2023 
Open Source License? Yes  
URL https://zenodo.org/doi/10.5281/zenodo.10392953
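
The combine-and-redact step described above might look roughly like the following sketch. It is not the notebook's code: the column names (`user_name`, `user_id`, `user_ip`, `subject_ids`, `subject_id`, `metadata`) are assumptions about the export layout, and `redact` and `combine` are hypothetical helpers written with the standard library only.

```python
import csv
import hashlib
import io


def redact(value: str, salt: str) -> str:
    """Replace an identifying value with a short salted hash (empty stays empty)."""
    if not value:
        return ""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


def combine(classifications, subjects, out, salt,
            identifying=("user_name", "user_id", "user_ip")):
    """Join classification rows to subject metadata and redact identifying columns.

    Column names are assumptions for illustration, not a documented schema.
    """
    subject_meta = {row["subject_id"]: row["metadata"]
                    for row in csv.DictReader(subjects)}
    reader = csv.DictReader(classifications)
    fieldnames = list(reader.fieldnames or []) + ["subject_metadata"]
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        for column in identifying:
            if column in row:
                row[column] = redact(row[column], salt)
        row["subject_metadata"] = subject_meta.get(row.get("subject_ids", ""), "")
        writer.writerow(row)


# Tiny in-memory example standing in for the two export files.
classifications_csv = (
    "classification_id,user_name,user_id,user_ip,subject_ids\n"
    "1,alice,42,abc123,1001\n"
    "2,,,def456,1002\n"
)
subjects_csv = "subject_id,metadata\n1001,page 1\n"

out = io.StringIO()
combine(io.StringIO(classifications_csv), io.StringIO(subjects_csv), out,
        salt="example-salt")
combined_rows = list(csv.DictReader(io.StringIO(out.getvalue())))
print(combined_rows[0]["subject_metadata"])
```

Salted hashing keeps contributions by the same volunteer linkable without publishing the username. Whether that is sufficient for a public deposit depends on the project's own data protection requirements.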
 
Title Observable notebook 'Heatmap for polygons' 
Description JavaScript Observable code notebook demonstrating a geospatial visualisation technique: "Visualise overlaps in a large polygon dataset: colourise-alpha using WebGL shaders + PIXI.js". The code notebook demonstrates the technique on historical maps data from National Library of Scotland. 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact Thanks from National Library of Scotland, whose data it is demonstrated on, who described the code as "really interesting and useful" for them. 
URL https://observablehq.com/@oliviafvane/heatmap-for-polygons
 
Title Press Picker: An interactive visualisation tool for newspaper metadata 
Description Press Picker was created to help select British Library newspaper titles for digitisation. Read more about the context in this blog post and see an interactive demo in this post. The tool provides an overview of newspaper holdings over time, their different formats (hardcopy or microfilm), and the relationship between titles connected by name changes. Titles can be selected within the interface and their data exported. We are sharing the code for reuse. Press Picker consists of two Python Jupyter notebooks. 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact Inquiry about reuse from Berlin State Library (Staatsbibliothek zu Berlin) 
URL https://github.com/Living-with-machines/PressPicker_public
 
Title alan-turing-institute/lwm_ARTIDIGH_2020_OCR_impact_downstream_NLP_tasks: ARTIDIGH Zenodo 
Description Small version bump with updated linguistic processing notebooks. 
Type Of Technology Software 
Year Produced 2020 
URL https://zenodo.org/record/3610375
 
Title alan-turing-institute/lwm_ARTIDIGH_2020_OCR_impact_downstream_NLP_tasks: ARTIDIGH Zenodo 
Description Small version bump with updated linguistic processing notebooks. 
Type Of Technology Software 
Year Produced 2020 
URL https://zenodo.org/record/3611200
 
Title davanstrien/computer-vision-DHNoridic-2020-workshop 0.1 
Description An introduction to computer vision for working with maps: workshop at DHN 2020 
Type Of Technology Software 
Year Produced 2020 
URL https://zenodo.org/record/4106323
 
Title davanstrien/computer-vision-DHNoridic-2020-workshop 0.1 
Description An introduction to computer vision for working with maps: workshop at DHN 2020 
Type Of Technology Software 
Year Produced 2020 
URL https://zenodo.org/record/4106322
 
Title deduplify - author Sarah Gibson 
Description deduplify is a Python command line tool that will search a directory tree for duplicated files and optionally remove them. It generates an MD5 hash for each file recursively under a target directory and identifies the filepaths that generate unique and duplicated hashes. When deleting duplicated files, it deletes those deepest in the directory tree first leaving the last present. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact The deduplify tool enables the deduplication of file records in messy datasets and has been used within the process of wrangling the JISC1 & JISC2 newspaper datasets into a form amenable to further processing. 
URL https://github.com/Living-with-machines/deduplify
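
The hashing approach in the description can be sketched as follows. This is not deduplify's implementation: `md5_of_file`, `find_duplicates` and `deletion_candidates` are hypothetical names, and the deepest-first ordering is one reading of the behaviour described above. The sketch only reports candidates and deletes nothing.

```python
import hashlib
import os
import tempfile
from collections import defaultdict
from typing import Dict, List


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: str) -> Dict[str, List[str]]:
    """Map each hash seen more than once under root to its file paths."""
    by_hash: Dict[str, List[str]] = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_hash[md5_of_file(path)].append(path)
    return {h: sorted(p) for h, p in by_hash.items() if len(p) > 1}


def deletion_candidates(paths: List[str]) -> List[str]:
    """Order duplicates deepest-first and hold back the shallowest copy."""
    ordered = sorted(paths, key=lambda p: p.count(os.sep), reverse=True)
    return ordered[:-1]


# Build a small throwaway tree with one duplicated file.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    for rel, text in [("x.txt", "same"),
                      (os.path.join("a", "b", "y.txt"), "same"),
                      (os.path.join("a", "z.txt"), "different")]:
        with open(os.path.join(root, rel), "w") as handle:
            handle.write(text)
    duplicates = find_duplicates(root)
    dup_count = len(duplicates)
    to_delete_names = [os.path.basename(p)
                       for paths in duplicates.values()
                       for p in deletion_candidates(paths)]

print(dup_count, to_delete_names)
```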
 
Title defoe, the Spark-based toolbox for analysing historical datasets
Description This work presents defoe, a new scalable and portable digital eScience toolbox that enables historical research. It allows for running text mining queries across large datasets, such as historical newspapers and books, in parallel via Apache Spark. It handles queries against collections that comprise several XML schemas and physical representations. The proposed tool has been successfully evaluated using five different large-scale historical text datasets and two HPC environments, as well as on desktops. Results show that defoe allows researchers to query multiple datasets in parallel from a single command-line interface and in a consistent way, without any HPC environment-specific requirements.
Type Of Technology Software 
Year Produced 2019 
Impact Originally developed by UCL and the British Library (funded by Jisc, 2015) then UCL (funded by 2016-2018), defoe was refactored and extended by EPCC, The University of Edinburgh, for: the Alan Turing Institute, funded by Scottish Enterprise as part of the Alan Turing Institute-Scottish Enterprise Data Engineering Program; the College of Arts Humanities and Social Sciences, The University of Edinburgh (2019-2020), as part of the Data Driven Innovation Programme (funded by the Edinburgh and South-East Scotland City Region Deal); and Living with Machines (2019-2020).
URL https://github.com/alan-turing-institute/defoe
 
Title defoe_visualization, a collection of notebooks for analysing further the results obtained by defoe 
Description defoe_visualization is a repository of Jupyter notebooks which complements the defoe scalable and portable digital eScience toolbox for historical research. These notebooks allow researchers to explore query results from defoe and to post-process the results to reveal new insights into the historical data processed by defoe. The notebooks are complemented with sample data files with the query results produced by the authors. 
Type Of Technology Software 
Year Produced 2019 
Impact Developed by EPCC, The University of Edinburgh in conjunction with: the Alan Turing Institute (2018-2019) funded by Scottish Enterprise as part of the Alan Turing Institute-Scottish Enterprise Data Engineering Program; the College of Arts Humanities and Social Sciences, The University of Edinburgh (2019-2020) as part of the Data Driven Innovation Programme (funded by the Edinburgh and South-East Scotland City Region Deal); and Living with Machines (2019-2020).
URL https://github.com/alan-turing-institute/defoe_visualization
 
Title flyswot 
Description flyswot is a command-line tool that supports British Library staff in processing 'legacy' digitised content using computer vision. It can be run across images in a directory to check for incorrect metadata. flyswot has the following features: UNIX-style search patterns for matching the images to predict against; a CSV output containing the paths to the input images, the predicted label and the model's confidence for that prediction; a summary 'report' providing a high-level overview of the predictions made by flyswot; and automatic download of the latest available flyswot model.
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact The British Library holds a large amount of 'legacy' digitised material (~1 Petabyte). Some of these images carry previously assigned, uncorrected metadata as a result of limitations in a legacy digitised image platform. In particular, images of manuscript pages were given the label 'flysheet' when no more suitable label was available. As a result, many images are falsely labelled as 'flysheets'. As part of the move to a new digital library system, there is a desire to correct this metadata. The scale of this problem makes fully manual intervention challenging. Flyswot, and the associated machine learning models, were developed in collaboration with the Heritage Made Digital team within the library to support library staff in processing this material. Flyswot is actively being used in this workflow and is helping to speed up the process of checking images and to assess the work required in processing collections. Beyond this, flyswot has also identified collection items that didn't have pagination and, as a result, curators have intervened not only in digital collections but also with the physical items.
URL https://github.com/davanstrien/flyswot
 
Title jisc-wrangler - author Timothy Hobson 
Description jisc-wrangler is a Python tool written specifically to restructure and deduplicate XML files containing OCR content from the JISC 1 & JISC 2 newspaper dataset. It outputs a canonical file structure and filename convention amenable to further processing with the alto2txt tool. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact This tool makes the JISC 1 & JISC 2 newspaper datasets accessible to the research project by cleaning, deduplicating and standardising the directory structure and filenames. It performs an essential pre-processing step that unlocks the potential of this open-access dataset. 
URL https://github.com/Living-with-machines/jisc-wrangler
 
Title subsamplr - author Timothy Hobson 
Description subsamplr is a Python tool for representative subsampling from a population. It was designed for sampling from a large collection of digital newspapers, but is a generic tool that could be applied in any context in which metadata is available for a population and a subsample is desired. Any features in the metadata can be used as dimensions for subsampling. The tool is configurable to connect to a metadata database and includes example Jupyter notebooks. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact Many avenues of research within the Living with Machines project target historic newspaper data, and given the volume of data available, the first step is typically to sample from the various newspaper collections to produce an accessible subset of data on which research methodologies can be developed and tested. The subsamplr tool is designed for precisely this purpose and is therefore an important component in the research workflow across a wide variety of investigations in the project. It enables researchers to specify subsampling parameters from which data samples are (reproducibly) generated that satisfy the requirements of the particular research question at hand. 
URL https://github.com/Living-with-machines/subsamplr
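
One way to draw a reproducible, representative subsample over metadata dimensions is stratified sampling with a fixed random seed. The sketch below illustrates that general technique and is not subsamplr's algorithm or interface: `stratified_sample`, its parameters and the example metadata fields are hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List, Sequence


def stratified_sample(records: Sequence[Dict], dimensions: Sequence[str],
                      fraction: float, seed: int = 0) -> List[Dict]:
    """Sample the same fraction from every stratum defined by the dimensions.

    Hypothetical helper: at least one record is taken from each stratum,
    and a fixed seed makes the result repeatable.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[tuple(record[d] for d in dimensions)].append(record)
    sample: List[Dict] = []
    for key in sorted(strata):  # sorted so the draw order is deterministic
        members = strata[key]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Invented metadata: 12 issues from the 1850s and 8 from the 1860s.
population = (
    [{"id": i, "decade": 1850} for i in range(12)]
    + [{"id": 100 + i, "decade": 1860} for i in range(8)]
)
sample = stratified_sample(population, dimensions=["decade"], fraction=0.25, seed=1)
repeat = stratified_sample(population, dimensions=["decade"], fraction=0.25, seed=1)
per_decade = Counter(record["decade"] for record in sample)
print(len(sample), dict(per_decade))
```

Because the seed fixes the draw, rerunning with the same parameters yields the same subset, which is the property the description refers to as reproducible generation.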
 
Description "How we collaborate" blog post series 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Blog post series reflecting on our experience of collaborating on the project.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/category/how-we-collaborate/
 
Description "Introducing the Language Lab" blog post 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Blogpost introducing the language lab, which explored the social and cultural impact of the Industrial Revolution as reported in newspapers and other types of textual sources.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/introducing-the-language-lab/
 
Description "Introducing..." blog post series 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact We published a series of blog posts introducing each member of the Living with Machines team
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/category/the-team/
 
Description 'Data visualisation for cultural heritage collections' course at N8 Centre of Excellence in Computationally Intensive Research 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Third sector organisations
Results and Impact Olivia Vane delivered a two-part workshop on data visualisation for Digital Humanities. Split over two sessions, the workshops gave an overview of the key concepts in data visualisation, before moving to tackle more practical exercises in the second week.
Year(s) Of Engagement Activity 2021
URL https://n8cir.org.uk/events/data-visualisation-hums/
 
Description 'HISTORICAL RESEARCH IN THE DIGITAL AGE', PART 4: 'RESEARCHING WITH BIG DATA; AND HOW HISTORIANS CAN WORK COLLABORATIVELY' 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Ruth Ahnert, Professor of Literary History & Digital Humanities at QMUL, considers in this blogpost how historians can work with big data, with reference to the need for and approaches to interdisciplinary collaboration. Ruth draws on her experience of leading Living with Machines, an interdisciplinary project bringing together historians and data scientists, and based at the British Library and Alan Turing Institute. Ruth and fellow researchers describe the project - and the opportunities and challenges of interdisciplinary working - in their new book, Collaborative Historical Research in the Age of Big Data, published by Cambridge University Press and freely available Open Access.
Year(s) Of Engagement Activity 2023
URL https://blog.royalhistsoc.org/2023/02/07/historical-research-in-the-digital-age-part-4/
 
Description 124 Introduction to OCR and HTR 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Daniel van Strien presented as part of a British Library staff training workshop on OCR (Optical Character Recognition)
Year(s) Of Engagement Activity 2020
 
Description A talk or presentation - D. Bevan "Building ethical frameworks to balance risk and innovation" invited panel at Research Panel 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Invited talk at panel as part of two day workshop
Year(s) Of Engagement Activity 2020
URL https://ei4ai.wordpress.com/workshop/
 
Description ACH talk: Bridging humanities: embedding public participation in a collaborative research project 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A talk for the Association for Computers and the Humanities ACH2021 conference in July 2021, presented by Mia Ridge and based on her work with Barbara McGillivray, Giorgia Tolfo, Emma Griffin and others in the project.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/bridging-humanities-embedding-public-participation-in-a-collaborati...
 
Description AI and ethics panel discussion for Leeds Digital Festival at Leeds City Museum 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact AI is coming, so how do we live and work with it? What can we all do to develop ethical approaches to AI to help ensure a more equal and just society?

Co-Investigators Maja Maricevic and Mia Ridge organised a public panel discussion to address questions around the ethics of AI, building on the issues explored in the Living with Machines exhibition. Hosted by Leeds City Museum and timed for inclusion in the Leeds Digital Festival, the event was held on September 29, 2022.

The event blurb was:
What can we all do to develop ethical approaches to AI to help ensure a more equal and just society?

AI is all around us - it's in our phones, our social networks, job and credit applications and more. AI is increasingly used to make complex judgements, and process information in creative ways that previously seemed unique to humans. But what about areas that require empathy and emotional intelligence?

Some uses of AI have very serious implications and require our full attention - especially when it comes to making decisions that can affect our values.

AI advances will continue to disrupt our lives. How will we live and work with machines in the future? What should our relationship with AI look like? How do we ensure that AI systems work for all humans, and not just those implementing them?

Join us from 5:30pm for an exciting and thought-provoking conversation with our expert panel on the ethics of AI.

Hear from our panel of experts including:

Chair - Timandra Harkness
Sherin Mathew - Founder & CEO of AI Tech UK
Robbie Stamp - CEO at Bioss International and author
Keely Crockett - Professor in Computational Intelligence, Manchester Metropolitan University
Andrew Dyson, Global Co-Chair of DLA Piper's Data Protection, Privacy and Security Group

You'll have a chance to ask questions in the Q&A, then mingle with other attendees over drinks.

This panel and related workshop are organised by the British Library and Alan Turing Institute's Living with Machines project, in partnership with Ai Tech North UK.
Year(s) Of Engagement Activity 2022
URL https://blogs.bl.uk/digital-scholarship/2022/09/learn-more-about-living-with-machines-at-our-events....
 
Description AI and the creative industries panel discussion for Leeds Digital Festival at Leeds City Museum 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact How will AI change what we wear, the TV and films we watch, what we read?

Co-Investigators Maja Maricevic and Mia Ridge organised a public panel discussion to address questions such as those posed in a related blog post: 'some uses of AI have very serious implications and require our full attention - especially when it comes to designing and using AI in ways that integrate a human-centred set of values. How we entertain ourselves with the aid of technology and AI has been on our radars for a while. But what about cultural and linguistic diversity, equality and equity of opportunity, fairness, enrichment of our life experiences, emotional growth and better mental health, stronger connection and understanding of others, the opportunity to learn and acquire new knowledge, delight in beauty? Can we envisage AI being part of these things?'

Hosted by Leeds City Museum and timed for inclusion in the Leeds Digital Festival, the event was held on September 22, 2022.

Event blurb:

Join us in person to hear from a panel of experts about the use of AI in fashion, beauty, broadcasting, arts, and heritage, and how this might impact our lives, and possibly our identity in the years to come.

There's an exponential rise of AI in our everyday lives - from the use of our data by social media, to the algorithms working out what we buy, how we vote and what we eat. AI is also increasingly underpinning the cultural and creative sphere of our lives. This creates new and exciting opportunities, but also brings new challenges.

Join us from 5:30pm for an exciting and thought-provoking conversation with our expert panel.

Our amazing panellists include:

Chair: Zillah Watson, independent consultant, ex-BBC
Rebecca O'Higgins - Founder KI-AH-NA
Laura Ellis, Head of Technology Forecasting, BBC
Maja Maricevic, Head of Higher Education and Science, British Library - libraries and heritage

You'll have a chance to ask questions in the Q&A, then mingle with other attendees over drinks.

The panel is organised by the British Library and Alan Turing Institute's Living with Machines project, in partnership with Ai Tech North UK.
Year(s) Of Engagement Activity 2022
URL https://livingwithmachines.ac.uk/the-role-of-ai-in-creative-and-cultural-industries/
 
Description AI4LAM presentation: AI training resources for GLAM 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation introducing an "AI training resources for GLAM review" document. The presentation took place as part of an AI4LAM community call (https://sites.google.com/view/ai4lam)
Year(s) Of Engagement Activity 2021
URL https://docs.google.com/document/d/1l4KFhAX1nijBUmE5Srfcq2ELFvrYbm8fp3jaszsmiAE/edit?usp=sharing
 
Description An introduction to computer vision for working with digitised heritage collections (workshop) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A workshop with around 50 participants introducing deep learning-based computer vision methods to digital humanities researchers and heritage professionals.
Year(s) Of Engagement Activity 2020
URL https://github.com/Living-with-machines/computer-vision-DHNordic-2020-workshop
 
Description Andre Piza presented at "Future of Journalism" to Open Society Foundation Journalism Programme 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A presentation about the Living with Machines project started a dialogue with BBC News Labs and Open Society. This led to a talk by BBC News Labs Executive Product Manager David Caswell at the Alan Turing Institute and a visit from Open Society's Independent Journalism Senior Programme Specialist, Shuwei Fang. Opportunities for collaboration with LWM are now being explored with BBC News Labs.
Year(s) Of Engagement Activity 2019
 
Description Annotation session with the British Library staff, 2 August 2019, organised by Daniel van Strien, Mariona Coll Ardanuy, and Mia Ridge 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact We had an open annotation session in which we invited British Library staff members to help with our experiments. We planned four different linguistic annotation tasks (named entity recognition, recognition of machines, entity linking to Wikipedia, and semantic role labeling) on newspaper articles from the nineteenth century.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/collecting-annotations-from-british-library-staff/
 
Description Article on History First 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Co-I Mia Ridge was interviewed by journalist Mark Bridge for a piece that was posted in September 2022, 'Tools from £9.2m Industrial Revolution project will uncover hidden stories'. The story featured project work with computer vision and maps, crowdsourcing and 'rail space', and mentioned the project's GitHub repository, website and exhibition. It concluded with a focus on making our work relevant and accessible to community historians and the GLAM sector.
Year(s) Of Engagement Activity 2022
URL https://historyfirst.com/tools-from-9-2m-industrial-revolution-project-will-uncover-hidden-stories/
 
Description Association for Computers and the Humanities paper presentation: Bridging humanities: embedding public participation in a collaborative research project 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A presentation for the annual Association for Computers and the Humanities conference that explicitly addressed the challenges of embedding crowdsourcing as a form of public engagement into a 'data science' research project with different conceptions of timelines, metrics for success, etc.
Year(s) Of Engagement Activity 2021
URL https://ach2021.ach.org/
 
Description Beta Test of Library Carpentry Introduction to AI and Machine Learning 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Workshop hosted by LIBER/BNF. Daniel van Strien contributed towards a beta test of a lesson that is currently in the early stages of development and is to become a part of the Library Carpentry Curriculum.
Year(s) Of Engagement Activity 2021
URL https://libereurope.eu/mec-events/beta-test-of-library-carpentry-introduction-to-ai-and-machine-lear...
 
Description Blog Post 'Heatmap for polygons: visualise overlaps in a large polygon dataset' 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact A technical 'how to' blog post on the Living with Machines website, describing a geospatial visualisation technique. National Library of Scotland (whose data the post demonstrates the technique on) and Registers of Scotland both fed back that the blog post was helpful and interesting.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/heatmap-for-polygons-visualise-overlaps-in-a-large-polygon-dataset/
 
Description Blog Post 'Press Picker code published' 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact "We are very pleased to share the code for 'Press Picker', our interactive data visualisation tool for newspaper metadata: https://github.com/Living-with-machines/PressPicker_public."
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/press-picker-code-published/
 
Description Blog Post on Sources Lab (Understanding the Victorian Newspaper Landscape) 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Blog post describing the work of the Sources Lab on digitising and processing the Newspaper Press Directories.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/sources-understanding-the-victorian-newspaper-landscape/
 
Description Blog post 'Press Picker: visualising formats and title name changes in the British Library's newspaper holdings' 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact Blog post on the Living with Machines website: 'Press Picker: visualising formats and title name changes in the British Library's newspaper holdings'.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/press-picker-visualising-formats-and-title-name-changes-in-the-brit...
 
Description Blog post: "Finding your way among newspapers" 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Blog post "Finding your way among newspapers" on how to select newspapers for digitisation at the British Library.
Year(s) Of Engagement Activity 2020
URL http://livingwithmachines.ac.uk/finding-your-way-among-newspapers/
 
Description Blog post: 'Platforms for People-Powered Research' 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post highlighting contributions to a conference and sharing a video from the panel discussion.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/platforms-for-people-powered-research/
 
Description Blog post: Ad or not? New crowdsourcing task 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A blog post describing a new crowdsourcing task that aimed to make data from a previous task easier to analyse by classifying articles as being advertisements or not.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/ad-or-not-new-crowdsourcing-task/
 
Description Blog post: Bridging humanities: embedding public participation in a collaborative research project 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post highlighting our contribution to a panel at the Association for Computers and the Humanities conference.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/bridging-humanities-embedding-public-participation-in-a-collaborati...
 
Description Blog post: Exploring ideas for our Living with Machines exhibition 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post setting out exhibition themes and introducing our collaboration with Leeds Museums and Galleries.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/exploring-ideas-for-our-living-with-machines-exhibition/
 
Description Blog post: First crowdsourced datasets available 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A post in support of the first open data release from crowdsourcing activities on the project, linking to the British Library's research repository.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/first-crowdsourced-datasets-available/
 
Description Blog post: From prams to Parliament - what was a machine? Help us find out 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A blog post in support of the Comms launch for novel crowdsourcing tasks designed in collaboration with historians, computational linguists and others on the Living with Machines project.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/from-prams-to-parliament-what-was-a-machine-help-us-find-out/
 
Description Blog post: Highlights from crowdsourcing projects at the British Library 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The post provided progress reports on a range of crowdsourcing projects at the British Library, including the Zooniverse tasks created through Living with Machines.
Year(s) Of Engagement Activity 2020
URL https://blogs.bl.uk/digital-scholarship/2020/12/highlights-from-crowdsourcing-projects-at-the-britis...
 
Description Blog post: Learning from Zooniverse volunteers to improve crowdsourcing projects 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A blog post that describes how feedback from volunteers led to improvements in our crowdsourcing task launched in December 2020.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/learning-from-zooniverse-volunteers-to-improve-crowdsourcing-projec...
 
Description Blog post: Sharing our Delivery Plan 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The post celebrated the deposition of our 2019 Delivery Plan in the British Library's repository. Sharing it was part of our commitment to transparency, and to sharing our lessons learnt as we ourselves learn them.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/sharing-our-delivery-plan/
 
Description Blog post: The role of AI in creative and cultural industries 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post by Co-Investigator Maja Maricevic in support of our events programme in Leeds in September 2022. The post provided background for the events and set out some of the questions our panellists were set to explore.
Year(s) Of Engagement Activity 2022
URL https://livingwithmachines.ac.uk/the-role-of-ai-in-creative-and-cultural-industries/
 
Description Blog post: What does a 'digital humanities research software engineer' do? 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A conversation between Mia Ridge and Olivia Vane designed to broaden the reach of our job advertisement for Olivia's replacement as DH RSE, and to demonstrate the range of experience, skills and job titles relevant to it. When we interviewed for the role, we learnt that this blog post was pivotal in the successful applicant deciding to apply.
Year(s) Of Engagement Activity 2021
URL https://livingwithmachines.ac.uk/what-does-a-digital-humanities-research-software-engineer-do/
 
Description Blog post: What is a 'machine' anyway? Help us describe them 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A blog post in support of the Comms launch for novel crowdsourcing tasks designed in collaboration with historians, computational linguists and others on the Living with Machines project.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/what-is-a-machine-anyway-help-us-find-out/
 
Description British Library Open House Session at Boston Spa 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact The Library's Living with Machines team provided an update on this collaborative project, including the ways in which its work with data science and digitised collections benefits the Library.
Year(s) Of Engagement Activity 2020
 
Description British Library Open House Session at King's Cross St. Pancras 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact The Library's Living with Machines team provided an update on this collaborative project, including the ways in which its work with data science and digitised collections benefits the Library.
Year(s) Of Engagement Activity 2020
 
Description British Library Show and Tell Session at King's Cross St. Pancras 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other audiences
Results and Impact An interactive poster session about the various tasks and outcomes of the project's Labs, attended by staff across the British Library and Alan Turing Institute.
Year(s) Of Engagement Activity 2019
 
Description British Library project web page 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Created a project web page on the British Library website to provide official visible information about the project in support of our other engagement activities.
Year(s) Of Engagement Activity 2021
URL https://www.bl.uk/projects/collective-wisdom
 
Description Cambridge GLAM Digital champions lightning talk "The Living with machines project" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact I presented the Living with Machines project to an audience of librarians and other professionals from the GLAM (Galleries, Libraries, Archives, Museums) sector.
Year(s) Of Engagement Activity 2020
URL https://www.eventbrite.co.uk/e/glam-digital-champions-digital-lunch-january-2020-tickets-89946158381...
 
Description Case Study OED API: Exploring word meaning in historical texts with computational methods 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post for the OED case studies series
Year(s) Of Engagement Activity 2021
URL https://public.oed.com/blog/case-study-oed-api/
 
Description Catching up with maps 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post on the Living with Machines website to provide a high-level update on the maps-related work in the project.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/catching-up-with-maps/
 
Description Code and Coffee
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post describing an internal project activity aimed at facilitating collaboration
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/code-and-coffee/
 
Description Collecting annotations from British Library staff 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post outlining an event held with British Library staff
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/collecting-annotations-from-british-library-staff/
 
Description Computational Approaches to Ordnance Survey Maps blog post 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This blog post introduces the preliminary work of the "Space & Time Lab" in Living with Machines, which experimented with computer vision methods for studying large sets of historical, digitized maps. With 179 page views, it generated several conversations with external researchers about our use of these methods in the humanities context.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/introducing-the-space-and-time-lab/
 
Description Computer Vision for Digital Heritage SIG 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post on the Living with Machines website announcing the new Computer Vision for Digital Heritage SIG.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/computer-vision-for-digital-heritage/
 
Description Computer Vision for the Humanities workshop (Warwick University) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact This workshop aims to provide an introduction to computer vision for humanities applications. In particular, it gives a high-level overview of machine-learning-based approaches to computer vision, focusing on supervised learning. The workshop includes discussion of working with historical data. The materials are based on in-progress Programming Historian lessons.
Year(s) Of Engagement Activity 2021
URL https://zenodo.org/record/4746493
 
Description Conference Roundtable: The Future of Spatial History for Spatial Humanities 2021/DHangouts 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Roundtable discussion on the future of spatial history with K. McDonough, J. Taylor, and L. Scholz, chaired by I. Gregory for the Spatial Humanities 2021 conference and presented as part of the DHangout series hosted by Lancaster University. Audience of about 35 people with conversation about the future of computational spatial historical research.
Year(s) Of Engagement Activity 2021
URL https://youtu.be/60aT8J4hMAA
 
Description Convening the Applied Data Analysis strand at the Digital Humanities at Oxford Summer School (July 2023) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact This strand teaches how to manipulate, analyse and explore data from the Humanities and the cultural sector. It is aimed at both GLAM professionals and academics, particularly those in the Arts & Humanities. It introduces both theoretical aspects (descriptive statistics, modelling) and practical aspects (Python data analysis stack) of applied data analysis.
Year(s) Of Engagement Activity 2023
 
Description Convening the Text to Tech strand at the Digital Humanities at Oxford Summer School (July 2023) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact This hands-on workshop offers an introduction to natural language processing in Python, from processing texts to extracting meaning from them, as well as the basics of automated semantic analysis with machine learning. It is aimed at both GLAM professionals and academics, particularly those in the Arts & Humanities.
Year(s) Of Engagement Activity 2023
URL https://web.cvent.com/event/58fc430e-5294-4919-a7a3-c2b14f81a059/websitePage:4745b3f6-aba6-4f03-ada6...
 
Description Convening the Text2Tech Strand at the Digital Humanities at Oxford Summer School 2022 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact We organised the Text2Tech strand at the Digital Humanities at Oxford Summer School, University of Oxford. Our week-long course provided an introduction to text mining with Python and was attended by 40 students, most of them postgraduates or academic staff.
Year(s) Of Engagement Activity 2022
URL https://eng.ox.ac.uk/events/dhoxss-2022/
 
Description Course 107 'Data Visualisation for Cultural Heritage Collections': British Library Digital Scholarship Training Programme 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Third sector organisations
Results and Impact In May 2020, Olivia Vane taught the rebooted Course 107 'Data Visualisation for Cultural Heritage Collections' for the British Library Digital Scholarship Training Programme: internal training in digital methods for British Library staff. The course was delivered over 2 sessions (4.5hrs in total) and included presentations and exercises with British Library datasets. It was taught over Zoom + Slack.
Year(s) Of Engagement Activity 2020
 
Description Crowdsourcing tasks 'What's that machine? Describe it!' and 'What's that machine? Classify it!' 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Building on the lessons learnt from earlier experiments, in early December we launched two new crowdsourcing projects devised in collaboration with researchers including historians and computational linguists. These projects aimed to integrate linguistic research questions with tasks that encouraged volunteers to engage with social and technological history in the pages of 19th century newspapers.

As part of the launch process we applied to become an official Zooniverse project, which included separate reviews by Zooniverse staff and volunteers. We tweaked the interfaces as a result, and were delighted to be recognised as an official Zooniverse project.

Nearly 10,000 tasks were completed by over 700 registered volunteers (and countless anonymous volunteers) within a week.
Year(s) Of Engagement Activity 2020
URL https://www.zooniverse.org/projects/bldigital/
 
Description D3 JavaScript visualisation in a Python Jupyter notebook 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post describing how to combine JavaScript, the visualisation library D3.js and Python Jupyter notebooks. Accompanying notebook code was published with this blogpost.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/d3-javascript-visualisation-in-a-python-jupyter-notebook/
 
Description Daniel van Strien: Flyswot: garden-variety machine learning applications conference presentation 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Conference presentation "Flyswot: garden-variety machine learning applications" at the ai4lam conference. Presenters: Daniel van Strien, Digital Curator at the British Library, Andrew Longworth, Digitisation Project Analyst at the British Library, Catherine Cronin, The Heritage Made Digital Team at the British Library
Year(s) Of Engagement Activity 2021
URL https://www.bnf.fr/en/program-international-conference-les-futurs-fantastiques-december-8-10-2021
 
Description Daniel van Strien AI4LAM webinar "In conversation with Jeremy Howard - upskilling to better navigate AI" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Webinar organised by Daniel van Strien as part of the AI4LAM teaching and learning working group and the AI4LAM Au/ANZ chapter. The webinar hosted invited speaker Jeremy Howard from fastai to speak on the topic of making machine learning accessible to people working in libraries.
Year(s) Of Engagement Activity 2022
URL https://www.eventbrite.com/e/in-conversation-with-jeremy-howard-upskilling-to-better-navigate-ai-tic...
 
Description Daniel van Strien The Carpentries: Introduction to AI for GLAM Workshop at AI4LAM conference 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Delivery of 'The Carpentries: Introduction to AI for GLAM' workshop online as part of the AI4LAM conference.
Year(s) Of Engagement Activity 2021
URL https://www.bnf.fr/en/agendaEN/workshops-tutorials-les-futurs-fantastiques-3rd-conference-about-arti...
 
Description Daniel van Strien, British Library Digital Scholarship Training Programme, workshop on computer vision for historical maps, 13 February 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact A workshop held for British Library staff on using Computer Vision methods with heritage data including historic map collections.
Year(s) Of Engagement Activity 2020
 
Description Daniel van Strien, Kaspar Beelen, CREATE Digital History Workshop: Maps-as-Data: Analysing Historical Maps with Computer Vision, Feb 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Workshop on using Computer Vision methods with historical collections, held at the CREATE centre at the University of Amsterdam.
Year(s) Of Engagement Activity 2020
URL https://www.create.humanities.uva.nl/events/digital-history-workshop-maps-as-data-analysing-historic...
 
Description Daniel van Strien, Katherine McDonough, Daniel Wilson presented at Victorian Data Conference, University of Virginia, November 15-16, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Three Living with Machines members presented in a session on "Living with Bias" at the Victorian Data conference, the first gathering of nineteenth-century studies scholars using digital methods in their work. Attended by about 100 researchers, our presentation both introduced Living with Machines to this largely US-based audience and generated several connections which have already resulted in visits to the Turing/BL in London in 2020 (including by the faculty director of the University of Virginia Scholar's Lab, Alison Booth, who was a co-host of this conference).
Year(s) Of Engagement Activity 2019
URL http://data-caucus.herokuapp.com/conference-cfp
 
Description Data Study Group on smart monitoring for conservation areas 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Data Study Groups (DSG) are intensive five day 'collaborative hackathons' hosted at the Turing, which bring together organisations from industry, government, and the third sector, with talented multi-disciplinary researchers from academia. Kasra Hosseini and Mariona Coll Ardanuy were the principal investigators of a DSG with the World Wide Fund for Nature (WWF) on "Smart monitoring for conservation areas". The methods explored are closely related to methods directly applicable to Living with Machines datasets.
Year(s) Of Engagement Activity 2019
URL https://www.turing.ac.uk/research/publications/data-study-group-final-report-wwf
 
Description David Beavan and James Hetherington contributing to Royal Society 'Dynamics of data science skills' 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Contribution to report - see link.
Year(s) Of Engagement Activity 2019
URL https://royalsociety.org/topics-policy/projects/dynamics-of-data-science/
 
Description David Beavan and Lydia France Living with Machines Distributed Conference panel 'AI Beyond STEM: digital skills to unleash the power of data science and AI for all' 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Expert panel session with international guests online to an international audience
Year(s) Of Engagement Activity 2023
URL https://livingwithmachines.ac.uk/event/ai-beyond-stem-digital-skills-to-unleash-the-power-of-data-sc...
 
Description David Beavan invited 'floating expert' and Mia Ridge, Dr. Katherine McDonough, Dr. Kaspar Beelen and Dr. Kasra Hosseini (project collaborator) invited participants at Computational Archival Science Workshop: Exploring Data, Investigating Methodologies, The National Archives, 20-21 June 2019 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact About 100 people attended this event, where Kaspar Beelen and Katie McDonough presented the keynote lecture on bias in digitized archival collections being used in the Living with Machines project. The international audience included GLAM professionals and students from the US, UK, and elsewhere in Europe, and the event fostered conversations about the role of GLAM institutions in collaborating with researchers to develop best practices for creating, preserving, and making accessible digitised and born digital collections.
Year(s) Of Engagement Activity 2020
URL https://blog.nationalarchives.gov.uk/computational-archival-science-cas-exploring-data-investigating...
 
Description David Beavan invited inaugural talk at inaugural Humanities Festival at University of Georgia, US 'Beyond Digital Humanities: Weaving Humanities Research Software Engineering and AI' 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact 50 staff and students attended the talk, with good engagement. We have since hosted UGA staff at the Turing and are planning bilateral training for postgraduates.
Year(s) Of Engagement Activity 2023
URL https://willson.uga.edu/public-humanities/uga-humanities-council/2023-uga-humanities-festival/
 
Description David Beavan invited presentation at Software Development in Digital Humanities Labs and Projects, University of Sussex, 30 July 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description David Beavan invited talk at National Library of Scotland: 'Focused tech development delivering enhanced collections data' 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Invited talk given by David Beavan to the National Library of Scotland (NLS) internal professional seminar series.
Year(s) Of Engagement Activity 2020
 
Description David Beavan led, Mia Ridge, Barbara McGillivray participated in panel discussion 'Data Science & Digital Humanities: new collaborations, new opportunities and new complexities' at Digital Humanities 2019 conference, Utrecht, July 11, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This panel highlights the emerging collaborations and opportunities between the fields of Digital Humanities (DH), Data Science (DS) and Artificial Intelligence (AI). It charts the enthusiastic progress of the Alan Turing Institute, the UK national institute for data science and artificial intelligence, as it engages with cultural heritage institutions and academics from arts, humanities and social sciences disciplines. We discuss the exciting work and learnings from various new activities, across a number of high-profile institutions. As these initiatives push the intellectual and computational boundaries, the panel considers the gains, benefits, and complexities encountered. The panel latterly turns towards the future of such interdisciplinary working, considering how DS & DH collaborations can grow, with a view towards a manifesto. As Data Science grows globally, this panel session will stimulate new discussion and direction, to help ensure the fields grow together and arts & humanities remain a strong focus of DS & AI. It will also help ensure that DH methods and practices continue to benefit from new developments in DS, which will enable future research avenues and questions.
Year(s) Of Engagement Activity 2019
URL https://dev.clariah.nl/files/dh2019/boa/0364.html
 
Description David Beavan presented at Turing Innovation Symposium, hosted by Accenture, Dublin, 3-4 April 2019. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview of Living with Machines for Turing Innovation Showcase in Dublin 2019.
Year(s) Of Engagement Activity 2019
 
Description David Beavan presented talk 'Potential Uses of a Registry of Digitised Works: By scholars' at Global Digitised Dataset Network, British Library, 10 June 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact Lessons from the project on uses of a registry of digitised works
Year(s) Of Engagement Activity 2019
URL https://gddnetwork.arts.gla.ac.uk/
 
Description Deep Learning approaches in GIScience session at the Royal Geographical Society Annual Conference: Maps and Machines presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation about computer vision for maps research at the annual Royal Geographical Society conference. Virtual audience of about 30 people.
Year(s) Of Engagement Activity 2021
URL https://sdesabbata.github.io/deep-learning-giscience/
 
Description Deep learning reading group 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post introducing an internal reading group on deep-learning methods being used by the project.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/deep-learning-reading-group/
 
Description Developing Data Study Group with TNA on (web) archives and social attitudes towards new technologies, initiated by Barbara McGillivray and David Beavan 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Data Study Groups are intensive five day 'collaborative hackathons' hosted at the Turing, which bring together organisations from industry, government, and the third sector, with talented multi-disciplinary researchers from academia. Beavan and McGillivray co-organised a DSG with the National Archives on "Discovering topics and trends in the UK Government Web Archive"
Year(s) Of Engagement Activity 2019
URL https://www.turing.ac.uk/events/data-study-group-december-2019
 
Description Diachronic and diatopic word embeddings from British historical newspapers (AIUCD conference) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Poster presentation on ongoing experiments with diachronic and diatopic word embeddings trained on historical British newspaper collections (1830-1889).
Year(s) Of Engagement Activity 2023
URL https://doi.org/10.5281/zenodo.7892460
 
Description Did Machines Drive History? 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post introducing the first minimum research outcome of the language lab, in which we explored to what extent machines were being seen as agents able to drive change.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/did-machines-drive-history/
 
Description Digital Humanities and Research Software Engineering working together: some examples of a fruitful collaboration from the Living with Machines project 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Federico Nanni and Kasra Hosseini (from the Research Engineering group at the Alan Turing Institute) and Kaspar Beelen and Mariona Coll Ardanuy (postdocs in the Living with Machines project) shared their experience in working together in projects at the intersection of software engineering, computational linguistics and digital humanities, as part of the KQ Codes Technical Socials at University College London. About 20 participants attended.
Year(s) Of Engagement Activity 2021
URL https://www.ucl.ac.uk/research-it-services/programming-hub/kq-codes-technical-socials
 
Description Digital Humanities at Oxford Summer School Virtual Event 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Kaspar Beelen, Federico Nanni, and Mariona Coll Ardanuy gave the talk "From Text to Tech: Text mining and the humanities, using language models to find living machines in nineteenth-century books" at the 2020 virtual edition of Digital Humanities at Oxford Summer School, with 270 attendees.
Year(s) Of Engagement Activity 2020
URL https://www.dhoxss.net/dhox2020-virtual-event-report
 
Description Digital Humanities at Oxford Summer School Virtual Event 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Interactive workshop on "An introduction to natural language processing with Python", organised by Mariona Coll Ardanuy, Kaspar Beelen, and Federico Nanni. Participants learned how to use Python programming for powerful text processing in the Humanities, from cleaning texts to extracting meaning from them, as well as the basics of automated semantic analysis with machine learning. There were 60 attendees.
Year(s) Of Engagement Activity 2020
URL https://www.dhoxss.net/dhox2020-virtual-event-report
 
Description Digital Humanities at Oxford Summer School Virtual Event 2021 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Interactive workshop on "Language models and their use in the digital humanities", by Mariona Coll Ardanuy, Kaspar Beelen, and Federico Nanni. This workshop offered a basic introduction to language models using Python. Participants learned how to use and interpret different language models and to train their own models. There were 16 participants.
Year(s) Of Engagement Activity 2021
URL https://digital.humanities.ox.ac.uk/digital-humanities-oxford-summer-school
 
Description Digital Humanities at Oxford Summer School Virtual Event 2021 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Kaspar Beelen, Federico Nanni, and Mariona Coll Ardanuy gave the talk "Models of Language: Using algorithms to explore the past" at the 2021 virtual edition of Digital Humanities at Oxford Summer School. There were 450 participants.
Year(s) Of Engagement Activity 2021
URL https://digital.humanities.ox.ac.uk/digital-humanities-oxford-summer-school
 
Description Echoing Through Time: New Tunes for Old Words 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A blog post about the process of recording ballads from the British Library's collections for use in the exhibition, and their subsequent release on Soundcloud.
Year(s) Of Engagement Activity 2022
URL https://livingwithmachines.ac.uk/echoing-through-time-new-tunes-for-old-words/
 
Description Emma Griffin invited presentation: International symposium - 'Dartmouth and the World', Dartmouth University, 10-20 October 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Gave talk on "Life and Living Standards in Britain's Industrial Revolution"
Year(s) Of Engagement Activity 2019
 
Description Emma Griffin invited presentation: Oregon State University, Centre for the Humanities, 7 October 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Talk on "Home Economics: Food, Money, and Emotions in Victorian Britain"
Year(s) Of Engagement Activity 2019
 
Description Engagement focused website, blog or social media channel - Blog post: Turing Researcher Spotlight - David Beavan 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Turing Researcher Spotlight - David Beavan. Senior Research Software Engineer David Beavan is using AI to unlock new insights into the Industrial Revolution.
Year(s) Of Engagement Activity 2022
URL https://www.turing.ac.uk/people/spotlights/david-beavan
 
Description Finding words in maps, part 2: seeing the results 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post about evaluating the 'Strabo' tool (software for transcribing text in digitised historical maps) on our map data through visualisation.
Year(s) Of Engagement Activity 2019
URL https://livingwithmachines.ac.uk/finding-words-in-maps-part-2-seeing-the-results/
 
Description Free Thinking: Archiving, curating and digging for data 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact BBC Radio 3 broadcast:

What stories are being uncovered by people working behind the scenes at museums and institutions? Lisa Mullen finds out talking to Tessa Jackson - Conservator;
David Beavan - Senior Research Software Engineer, Turing Institute and Matt Harle - Archivist and curator at the Barbican.

Barbara Hepworth: Art & Life runs at the Hepworth Wakefield from 21 May 2021 to 27 Feb 2022. The gallery also runs a Hepworth Research Network in partnership with the Department of History of Art at the University of York and the School of Art, Design and Architecture at the University of Huddersfield.
https://hepworthwakefield.org/our-story/hepworth-research-network/people/

Matthew Harle is an archivist working with the Barbican as it prepares for its 40th anniversary so is assembling an archive alongside the Guildhall School of Music and Drama
https://www.barbican.org.uk/our-story/our-archive/about-the-archive
https://matthewharle.com/Barbican-Archive

The Alan Turing Institute https://www.turing.ac.uk/ is the national institute for data science and artificial intelligence running a host of research projects into topics including AI, Public Policy and Living with Machines - a project that rethinks the impact of technology on the lives of ordinary people during the Industrial Revolution.
https://livingwithmachines.ac.uk You can hear more from historian Emma Griffin in this conversation about Understanding the Industrial Revolution https://www.bbc.co.uk/programmes/p081y7h4
Year(s) Of Engagement Activity 2021
URL https://www.bbc.co.uk/programmes/m000vydf
 
Description G. Solomon and J. Rhodes 'Work, Occupational Change, and Technological Adoption: Britain, 1851-1911', European Social Science History Conference (ESSHC) 2023, Gothenburg, Sweden, 13/04/23 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation of work on occupational change using digitised census data to ESSHC 2023 in Gothenburg, Sweden. Gained feedback from discussant and audience members, and broadened engagement with our methodological approaches (nominal linkage and street geo-coding)/research findings (human capital formation in the bicycle industry). Generated significant discussion, and resulted in future plans for participation in a 'best practice' workshop.
Year(s) Of Engagement Activity 2023
URL https://esshc.socialhistory.org/conference/programme?day=95&time=328&session=5399&textsearch=solomon...
 
Description Giorgia Tolfo & Timothy Hobson online poster presentation at Data for History (June 2021): Modelling Time, Places, Agents (Berlin) entitled "Supporting an interdisciplinary research agenda through meta-modelling. The case of LwM" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A poster was presented for discussion with academic conference attendees on the subject of the "meta-modelling" approach taken to conceptual data modelling within the LwM project.
Year(s) Of Engagement Activity 2021
 
Description Hacking 23 years of government history: An example from The UK Government Web Archive 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Turing blog:

Web archives provide a key resource for the public. They allow us to access a wide range of data reflecting all areas of a society but, as they are large and meticulously maintained datasets, they can be daunting and difficult to navigate.

The Alan Turing Institute and The National Archives co-organised a Data Study Group challenge. Data Study Groups (DSGs) are events hosted by the Turing, which bring together some of the top talent from data science, artificial intelligence, and wider fields from across the world, to analyse real-world data science challenges.

The culmination of that work is now available to read via the published Data Study Group report 'Discovering topics and trends in the UK government web archive'
Year(s) Of Engagement Activity 2021
URL https://www.turing.ac.uk/blog/hacking-23-years-government-history-example-uk-government-web-archive
 
Description Historical Hypothesis Generation (BlogPost) 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Long blog post outlining an element of our interdisciplinary method.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/historical-hypothesis-generation-hypgen/
 
Description Hunting for Treasure: Living with Machines and the British Library Newspaper Collection. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation "Hunting for Treasure: Living with Machines and the British Library Newspaper Collection." at the Impresso Workshop in Lausanne (held online)
Year(s) Of Engagement Activity 2020
URL https://impresso.github.io/eldorado/online-program/
 
Description IIIF conference lightning talk: IIIF and machine learning inference: a love story? 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Lightning talk as part of the IIIF conference discussing the use of IIIF and computer vision to work with a Library of Congress collection of digitised newspapers.
Year(s) Of Engagement Activity 2021
URL https://iiif.io/event/2021/annual_conference/
 
Description Implications of AI for Libraries presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact A presentation as part of a post-graduate library science talk on the implications of AI, drawing examples from the Living with Machines project.
Year(s) Of Engagement Activity 2020
 
Description Information+ Conference talk: Olivia Vane, Kasra Hosseini, Katherine McDonough and Daniel CS Wilson - 'Maps in Time: Visualising the historical Ordnance Survey' 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A division is often made between maps and timelines. This presentation from the Living with Machines project explores combining the two, visualising a dataset of 130,000 maps from the early Ordnance Survey (OS), Britain's national mapping agency. It was the OS who, from the early 19th century, created the first comprehensive, detailed and accurate picture of Great Britain. We show how animated data graphics can bring the story of the maps to life for a popular audience. We also visualise the data by space and time to support analysis in research.
Year(s) Of Engagement Activity 2021
URL https://vimeo.com/598429189
 
Description Intro to D3 session for Alan Turing Institute REG 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Teaching an 'Introduction to D3.js' for the Alan Turing Institute Research Engineering Group lunchtime tech talks. 2hr session: presentation and going through tutorials.
Year(s) Of Engagement Activity 2020
 
Description Introduction to Computer Vision for Digital Heritage using Living with Machines research 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Presentation as a part of the day-long conference organized by Polly Hudson to review Colouring London and related research for applications with Historic England and adjacent agencies.
Year(s) Of Engagement Activity 2021
URL https://colouringlondon.org/
 
Description Introduction to Jupyter Notebooks: the weird and the wonderful 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact An online workshop focused on potential uses of Jupyter Notebooks in GLAM (Galleries, Libraries, Archives and Museums) settings.
Year(s) Of Engagement Activity 2021
URL https://github.com/Living-with-machines/Jupyter-Notebooks-The-Weird-and-Wonderful
 
Description Introduction to Python, with Mariona Coll Ardanuy, July 19th 2019, organised by Mariona Coll Ardanuy for Turing Community 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact 4-hour introductory course to programming for the Humanities, with a focus on text processing and data wrangling (e.g. opening and working with documents and file paths). The feedback was very positive. Participants got acquainted with the basics of Python programming, which they have been able to apply to the project on multiple occasions.
Year(s) Of Engagement Activity 2019
 
Description Invited lecture, Luxembourg Centre for Contemporary and Digital History (C2DH) Hands-on History lecture series 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact My talk, 'Crowdsourcing in Living with Machines: crowdsourcing for engagement meets data science research', sparked a rich discussion afterwards
Year(s) Of Engagement Activity 2022
URL https://www.c2dh.uni.lu/events/crowdsourcing-living-machines-crowdsourcing-engagement-meets-data-sci...
 
Description Invited talk 'Living with Machines', University of Aarhus 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Talk about the Living with Machines project for DH colleagues at Aarhus
Year(s) Of Engagement Activity 2022
 
Description Invited talk on Computer Vision research in LwM for the Association of Geographic Information-Scotland. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact About 150 people attended a talk about Living with Machines research with historical maps.
Year(s) Of Engagement Activity 2021
 
Description Invited talk, Princeton University, 'Crowdsourcing and the Humanities' 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Invited talk and panel discussion for an event with the Center for Research Data and Digital Scholarship at University of Pennsylvania Libraries, The Center for Digital Humanities at Princeton University Library, the Princeton Geniza Lab, and the Zooniverse, attended by c40 people. The panel and event sparked extended discussion on social media.
Year(s) Of Engagement Activity 2021
URL https://genizalab.princeton.edu/crowdsourcing-and-the-humanities
 
Description Invited talk: Crowdsourcing in cultural heritage lecture for Institut für Kunstgeschichte, LMU München 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact An invited talk for a German seminar group.
Year(s) Of Engagement Activity 2021
 
Description Invited talk: User Experience (UX) for Citizen Science, iDigBio event 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I was invited to speak at the event 'Biodiversity Digitization: Celebrating a decade of progress' in the session 'Innovations: Strategy & Coordination'. My talk outlined the importance of user experience design (UX) for increasing diverse participation in citizen science projects.
Year(s) Of Engagement Activity 2021
URL https://www.idigbio.org/wiki/index.php/Biodiversity_Digitization:_Celebrating_a_decade_of_progress
 
Description J. Rhodes and G. Solomon 'New perspectives on occupational change: Britain, 1851-1911', North American Conference on British Studies (NACBS) 2022, Chicago, 11/11/2022 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation of work on occupational change using digitised census data to NACBS 2022 in Chicago. Aim was to get feedback on our paper from discussant and audience members and to promote our new approaches (geocoding and nominal linkage). Audience questions and discussant's comments provided important feedback on how to shape the paper for future publication. The session raised awareness of LwM's work on census data within the historical discipline.
Year(s) Of Engagement Activity 2022
 
Description J. Rhodes, J. Lawrence, D. Wilson, K. Beelen, K. McDonough, "Beyond the Tracks" presentation at DH2022 (online/Tokyo) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact About 25 participants joined the panel where we presented a paper "Beyond the tracks" for the Digital Humanities 2022 conference.
Year(s) Of Engagement Activity 2022
URL https://dh2022.dhii.asia/dh2022bookofabsts.pdf
 
Description Jon Lawrence, Inter-Disciplinary Research Programme Assessor for British Academy - 'The Humanities and Social Sciences Tackling the UK's International Challenges' (2019) 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Assessing projects under the heading "The Humanities and Social Sciences Tackling the UK's International Challenges"
Year(s) Of Engagement Activity 2019
URL https://www.thebritishacademy.ac.uk/programmes/tackling-uk-international-challenges
 
Description K. Beelen, K. McDonough, "Maps and Machines: using computer vision to analyze the geography of industrial change (1790-1920)", University of Aberdeen DH Seminar, 26 Oct 2021 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Presentation of maps research to DH community at the University of Aberdeen.
Year(s) Of Engagement Activity 2021
 
Description K. Beelen, K. McDonough, DCS Wilson, J. Lawrence, K. Westerling, "The 'Environmental Scan' at work: radical contextualisation of newspaper collections for new historical research," DH2023 Long Paper, 10-14 July. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact About 30 people attended this presentation at the 2023 DH conference and it has sparked conversation about the Environmental Scan method for non-UK collections as well as discussion about the accessibility of historical newspaper collections.
Year(s) Of Engagement Activity 2023
 
Description K. Beelen, M. Coll Ardanuy and F. Nanni: "Breaking (the?) news in the nineteenth century", Knowledge, Information and Data Science (KIDS) group, University College London (UCL), London 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact We presented the results of a series of collaborations at the intersection of digital history, computational linguistics and software engineering focused on the use of our large digital collection of 19th Century newspapers.
Year(s) Of Engagement Activity 2022
 
Description K. Beelen, M. Coll Ardanuy and F. Nanni: "Living with Machines: Analysing Digital Heritage at Scale", Digital Humanities Lab Exeter, University of Exeter 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact We presented the results of a series of collaborations at the intersection of digital history, computational linguistics and software engineering focused on the use of our large digital collection of 19th Century newspapers.
Year(s) Of Engagement Activity 2022
URL https://www.exeter.ac.uk/news/events/details/index.php?event=11894
 
Description K. McDonough "Maps as Data," OBTIC Séminaire, Paris, France, 3 June 2022 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Talk to Paris-based research group working on text analysis with historical documents, attended by about 30 people.
Year(s) Of Engagement Activity 2022
URL https://obtic.sorbonne-universite.fr/actualite/je-analyse-spatiale-des-textes-litteraires/
 
Description K. McDonough, "DH Careers: Beyond the Professoriate," CESTA, Stanford University, 15 Feb. 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact 30 PhD students at Stanford University attended this workshop to discuss career opportunities in the digital humanities.
Year(s) Of Engagement Activity 2022
URL https://cesta.stanford.edu/events/dh-careers-beyond-professoriate
 
Description K. McDonough, "Maps as Data for Open Historical Research," Roundtable on AI and the Historical Profession: Applications and Implications, American Historical Association Annual Meeting, San Francisco, CA 4-7 Jan. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Talk about the opportunities for using maps and AI methods in historical research at the American Historical Association. Attended by around 50 people.
Year(s) Of Engagement Activity 2024
URL https://aha.confex.com/aha/2024/meetingapp.cgi/Session/25011
 
Description K. McDonough, "Maps as [Open] [Humanities] Data: From Access to Analysis," Reimagining Industry/Academic/Cultural Heritage Partnerships in AI Workshop, AEOLIAN Network (Artificial Intelligence for Cultural Organisations), [virtual] 25 Oct. 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation to international audience about the maps research in Living with Machines, in particular the issues around ethical use of heritage resources in digital research.
Year(s) Of Engagement Activity 2021
URL https://www.aeolian-network.net/events/workshop-2/
 
Description K. McDonough, D. Wilson, K. Beelen, G. Solomon, "Historians Among the Machines: From Reproducible Computational Experiments to Persuasive Historical Arguments" session, American Historical Association Annual Meeting, San Francisco, CA 4-7 Jan. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentations from 4 former Living with Machines PDRAs at the American Historical Association on a panel dedicated to the project. Chaired by Lauren Tilton (University of Richmond).
Year(s) Of Engagement Activity 2024
URL https://aha.confex.com/aha/2024/meetingapp.cgi/Session/25012
 
Description K. McDonough, K. Hosseini, "Maps as Data" for Turing Catch Up Monthly Meeting, Jan 24 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Lightning talk about Maps research within Living with Machines during the monthly Turing Catch Up. Resulted in several inquiries about new applications, further research with MapReader.
Year(s) Of Engagement Activity 2022
 
Description Kaspar Beelen "Surveying the Newspaper Landscape" (CREATE Salon, February) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation at the University of Amsterdam attended by ca. 20 people. It was part of the "Salon" series organized by CREATE Amsterdam (Julia Noordegraaf).
Year(s) Of Engagement Activity 2020
URL https://www.create.humanities.uva.nl/
 
Description Kaspar Beelen Presentation for the British Library News Collection Group 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Presentation on the digitization of the Newspaper Press Directories and how this feeds into understanding the shape and contours of digital newspaper collections.
Year(s) Of Engagement Activity 2020
 
Description Kaspar Beelen and Katherine McDonough Keynote presentation at the Computational Archival Science symposium "Surveying the Land" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Keynote presentation by Kaspar Beelen and Katherine McDonough at the Computational Archival Science Symposium, organized at the Alan Turing Institute (January 2020).
Year(s) Of Engagement Activity 2020
URL https://www.turing.ac.uk/events/computational-archival-science-cas-symposium
 
Description Kaspar Beelen, Invited talk "Stereotypes in Newspaper data" at the Dutch National Library Research Week, September 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact Presentation at the Dutch Royal Library (KB) to report on the progress of my Research in Residence programme. It was part of the KB "Research Week" and was the most popular in terms of people signing up.
Year(s) Of Engagement Activity 2019
 
Description Kaspar Beelen, Panel discussion on Coding Literacy in the Digital Humanities, at Digital Humanities Benelux, September 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Participation in a round table on the topic of "Coding Literacy in the Humanities" (organized by Marijn Koolen, Liliana Melgar and Mari Wigham). The round table included a presentation with different experts (Joris van Zundert, Elli Bleeker, Sally Chambers) and discussion with an audience of Digital Humanities experts.
Year(s) Of Engagement Activity 2019
URL http://2019.dhbenelux.org/wp-content/uploads/sites/13/2020/01/DH_Benelux_2019_paper_25.pdf
 
Description Kaspar Beelen, Presentation on "Bias in the British Newspaper Archive" at Digital Humanities Benelux, September 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact 15-minute paper presentation on the work that emerged out of the Sources Lab, focussed on understanding the newspaper landscape. Attended by ca. 25 people from various backgrounds (DH researchers, librarians).
Year(s) Of Engagement Activity 2019
URL http://2019.dhbenelux.org/wp-content/uploads/sites/13/2019/08/DH_Benelux_2019_paper_33.pdf
 
Description Kaspar Beelen, Presentation on "The Agency of Machines" at Digital Humanities Benelux, September 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Presentation reporting on "The Agency of Machines" at the poster session of Digital Humanities Benelux, 2019. It involved discussion with many interested attendees of the conference.
Year(s) Of Engagement Activity 2019
URL http://2019.dhbenelux.org/program/
 
Description Kaspar Beelen, Seminar on History and Text, Antwerp University, November 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Undergraduate students
Results and Impact Presentation on the use of Text Mining for History. Part of the course "History and Language" (BA2) organised by Marnix Beyen (University of Antwerp).
Year(s) Of Engagement Activity 2019
 
Description Katherine McDonough and Jon Lawrence, "An introduction to Living with Machines," University of Exeter DH Seminar, 23 October 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other audiences
Results and Impact Presentation to about 40 people at the DH Seminar at Exeter was a great opportunity to make contact with the expert community there and introduce them to our ongoing work.
Year(s) Of Engagement Activity 2019
URL http://www.exeter.ac.uk/news/events/details/index.php?event=9637
 
Description Katherine McDonough organized meeting with US experts in historical map processing using computer vision (29/8/2019 and 1/11/2019) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Conversation to plan for future collaboration with researchers working at the cutting edge of computer vision for historical maps in the United States.
Year(s) Of Engagement Activity 2019
 
Description Katherine McDonough, "Living with Machines," invited presentation at Spatial Relationships in Text as Data, The Alan Turing Institute, October 28, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Invited talk to review applications of research on qualitative spatial relations in the Living with Machines project. Question session offered an opportunity to learn about related research in the UK and to share our ongoing work with leaders in the field.
Year(s) Of Engagement Activity 2019
URL https://www.eventbrite.co.uk/e/spatial-relationships-in-text-as-data-tickets-76259685773
 
Description Katherine McDonough, "Living with Machines," presentation at DH Seminar, Center for Spatial and Textual Analysis, Stanford University, December 2 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact About 60 people attended a presentation at Stanford University about Living with Machines. This conversation has created substantive links to the DH community at Stanford and there is continued interest in collaborating with us in the future.
Year(s) Of Engagement Activity 2019
URL https://cesta.stanford.edu/events/cesta-seminar-dr-katie-mcdonough
 
Description Katherine McDonough, Fantastic Futures, invited presentation and workshop on computer vision for historical maps, 4-5 December 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Presented Living with Machines research on computer vision with maps during a roundtable on applications of AI in GLAM institutions, generating conversation with an international audience about working with visual heritage materials at scale. The workshop offered GLAM staff, researchers, and policy leaders an opportunity for hands-on experience in computer vision, which has translated into invitations for collaboration and additional teaching opportunities.
Year(s) Of Engagement Activity 2019
URL https://fantasticfutures.stanford.edu/
 
Description Katie McDonough, Olivia Vane, and Daniel Van Strien gave a '21st Century Talk' for British Library staff: 'Maps and Machines: Using Computer Vision to Analyze the Geography of Industrialization (1780-1920)', 14 Jan 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Delivered a talk about using computer vision techniques to analyse digitised historical maps at scale.
Year(s) Of Engagement Activity 2020
 
Description LWM listed at Genealogy Stories "10 Websites for the History of Ordinary People" 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact An article on Medium listing good websites for the public to find out more about the history of ordinary people included Living with Machines as a recommended source.
Year(s) Of Engagement Activity 2021
URL https://genealogystoriesuk.medium.com/10-websites-for-the-history-of-ordinary-people-9ecc8b1b4832
 
Description Lab Talk for Workshop on Visualization for the Digital Humanities 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Olivia Vane gave a lightning talk at the online 5th Workshop on Visualization for the Digital Humanities about the British Library Digital Scholarship department.
Year(s) Of Engagement Activity 2020
URL http://vis4dh.org/
 
Description Learn more about Living with Machines at events this winter 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A blog post to promote exhibition-related events to be held at Leeds City Museum and online.
Year(s) Of Engagement Activity 2022
URL https://blogs.bl.uk/digital-scholarship/2022/10/learn-more-about-living-with-machines-at-events-this...
 
Description Learn more about what AI means for us at Living with Machines events this autumn 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post in support of events co-organised by two Co-Is and held at Leeds City Museum as part of the Leeds Digital Festival.
Year(s) Of Engagement Activity 2022
URL https://blogs.bl.uk/digital-scholarship/2022/09/learn-more-about-living-with-machines-at-our-events....
 
Description Library Carpentry session 1 (workshop) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Workshop 1 of a series of Library Carpentry workshops (https://librarycarpentry.org/) delivered online to British Library Staff
Year(s) Of Engagement Activity 2020
 
Description Library Carpentry session 2 (workshop) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Workshop 2 of a series of Library Carpentry workshops (https://librarycarpentry.org/) delivered online to British Library Staff
Year(s) Of Engagement Activity 2020
 
Description Library Carpentry session 3 (workshop) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Workshop 3 of a series of Library Carpentry workshops (https://librarycarpentry.org/) delivered online to British Library Staff
Year(s) Of Engagement Activity 2020
 
Description Library Carpentry session 4 (workshop) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Workshop 4 of a series of Library Carpentry workshops (https://librarycarpentry.org/) delivered online to British Library Staff
Year(s) Of Engagement Activity 2020
 
Description Linking Geo-Data through Test and Play 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Tutorial on DeezyMatch, troubleshooting session, and final roundtable to discuss the tools useful in linking geospatial data from historical sources.
Year(s) Of Engagement Activity 2020
URL https://github.com/LinkedPasts/LaNC-workshop
 
Description Living with Machines Documentary episode 2: The Digitisation Process
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This series of short videos in documentary form seeks to make visible the collaborative underpinnings of the project by highlighting the team's experiences, research objectives, challenges, and lessons learnt.
Living with Machines was funded by UK Research and Innovation (UKRI), via the Strategic Priorities Fund and was administered by the Arts and Humanities Research Council (AHRC). This episode focuses on the digitisation process.
Find out more here: https://bit.ly/49sUXjs
Year(s) Of Engagement Activity 2024
URL https://www.youtube.com/watch?v=aGF343ketqw&list=PLuD_SqLtxSdWMYcu5YQDGqP9AGejg_cBb&index=2&t=4s
 
Description Living with Machines Documentary Episode 3: Computational methods, infrastructure, and skills 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This series of short videos in documentary form seeks to make visible the collaborative underpinnings of the project by highlighting the team's experiences, research objectives, challenges, and lessons learnt.
Living with Machines was funded by UK Research and Innovation (UKRI), via the Strategic Priorities Fund and was administered by the Arts and Humanities Research Council (AHRC). This episode focuses on computational methods, infrastructure, and skills.
Find out more here: https://bit.ly/49sUXjs
Year(s) Of Engagement Activity 2024
URL https://www.youtube.com/watch?v=cmW10eK-ojs&list=PLuD_SqLtxSdWMYcu5YQDGqP9AGejg_cBb&index=3&t=5s
 
Description Living with Machines Documentary episode 1: On Collaboration 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This series of short videos in documentary form seeks to make visible the collaborative underpinnings of the project by highlighting the team's experiences, research objectives, challenges, and lessons learnt.
Living with Machines was funded by UK Research and Innovation (UKRI), via the Strategic Priorities Fund and was administered by the Arts and Humanities Research Council (AHRC).
Find out more here: https://bit.ly/49sUXjs
Year(s) Of Engagement Activity 2023
URL https://www.youtube.com/watch?v=A__ZJgw4_00&list=PLuD_SqLtxSdWMYcu5YQDGqP9AGejg_cBb&index=1&t=48s
 
Description Living with Machines Documentary episode 4: The Environmental Scan 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This series of short videos in documentary form seeks to make visible the collaborative underpinnings of the project by highlighting the team's experiences, research objectives, challenges, and lessons learnt.
This episode focuses on the method of the 'environmental scan', which quantifies how the digitisation policies of the past (i.e., what gets into digitised corpora) can bias the outcomes of analyses we run.
Year(s) Of Engagement Activity 2024
URL https://www.youtube.com/watch?v=vTc4S3Zx9IA&list=PLuD_SqLtxSdWMYcu5YQDGqP9AGejg_cBb&index=4
 
Description Living with Machines OCR hack 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post outlining an internal 'hack' event focused on OCR.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/living-with-machines-ocr-hack/
 
Description Living with Machines book launch 'Collaborative Historical Research in the Age of Big Data: Lessons from an interdisciplinary project' 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The event was an online roundtable discussion, led by hosts Professor Jane Winters and Professor James Smithies, with the authors, Ruth Ahnert, Emma Griffin, Mia Ridge and Giorgia Tolfo. It celebrated and promoted the newly published book 'Collaborative Historical Research in the Age of Big Data: Lessons from an interdisciplinary project' (available open access from Cambridge University Press as part of the Elements Series). It was part of AI UK 2023, The Alan Turing Institute's national showcase of data science, machine learning and artificial intelligence research and innovation. At a series of events between 6 and 31 March 2023, AI UK Fringe brings together leaders in academia from across the UK's AI ecosystem to demonstrate, exhibit and update on their ground-breaking work.
There were 196 registrants and 70 online attendees, and a recording of the session was publicised on The Alan Turing Institute YouTube channel.
Year(s) Of Engagement Activity 2023
URL https://livingwithmachines.ac.uk/event/book-launch-collaborative-historical-research-in-the-age-of-b...
 
Description MapReader Launch 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The MapReader Launch showcased librarians, historians, and data scientists discussing how Living with Machines has used this newly developed software library to experiment with National Library of Scotland Ordnance Survey maps. Launch participants had a chance to test MapReader with Ordnance Survey maps from the NLS and the British Library. Chris Fleet (NLS) and Nicole Coleman (Stanford) presented keynotes.

On the afternoon of June 8, we heard from colleagues working on other open source, interdisciplinary projects that also explore historical map collections as primary sources. These projects are now featured as resources in the new Tool Gallery of The Alan Turing Institute's Computer Vision for Digital Heritage Special Interest Group.

The MapReader Launch brought together historians and others with an interest in using digitized map collections as primary sources for computational research. Collectively, we learned about and discussed ways to encourage more open research in this space through skill development and shared digital resources and infrastructure.

The impact of the Launch has been impressive: it has enabled library curators to teach their own communities how to use MapReader with digitized map collections, motivated PhD and postdoctoral research with maps, and set in motion future working collaborations with organisations like the Office for National Statistics, the University of Antwerp, Stanford Libraries, the National Archives (UK), and the French National Library. This exceptionally well-received event sets the stage for future MapReader research and development at an international level.
Year(s) Of Engagement Activity 2023
URL https://livingwithmachines.ac.uk/event/mapreader-launch/
 
Description MapReader Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact In this workshop, we jumpstarted efforts to make MapReader more accessible and easier to use by bringing together people who had already had some exposure to it for intensive, shared work. We focused on 3 main activities: testing MapReader on existing data to identify bugs and opportunities for simplifying or improving the code or library design; developing approaches for evaluating and analyzing MapReader outputs that are meaningful to humanities and some social science research; and documenting needs for tutorials and software documentation in order to make MapReader more accessible to specific user groups (e.g. historians, curators, geographers).

Impacts have included ongoing engagement with the software library as a research and teaching tool by invited participants, integration of comments into the MapReader roadmap, and significant improvements to MapReader functionality thanks to bug reporting.
Year(s) Of Engagement Activity 2023
 
Description MapReader day at ANHIMO 2023 Sorbonne Summer School in Paris 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Talks and hands-on workshop introducing MapReader to computer science and digital humanities postgraduate students at the Summer School of Numerical Analysis of the History of the Sea and Oceans hosted by the Sorbonne University Alliance Ocean Institute and SCAI (Sorbonne Center for Artificial Intelligence) in Paris on 27 June 2023. Led by Katie McDonough, Andy Smith, and Daniel Wilson.

Impacts include extending re-use of MapReader among historians in France.
Year(s) Of Engagement Activity 2023
URL https://scai.sorbonne-universite.fr/public/events/view/a8046651d11c55bfbcd0/11
 
Description Maps as Data: A Humanistic Approach to Computer Vision for Large Map Collections 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation for the Unlocking Historical Maps of Southeast Asia Webinar Series, organized by Jane Jacobs at Yale-NUS in Singapore. The virtual workshop session was attended by 55 students, scholars, and librarians who are developing projects that use computational methods to study digitised map collections.
Year(s) Of Engagement Activity 2020
URL https://historicmapssea.commons.yale-nus.edu.sg/unlocking/
 
Description Mariona Coll-Ardanuy, Presentation at CogSci seminar at QMUL (13/06/2019) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other audiences
Results and Impact Talk at the Cognitive Science group at Queen Mary University of London, presenting preliminary research on the language lab work for Living with Machines. There were very relevant comments, and interesting questions as well. A subsequent talk at the Cognitive Science seminar was planned, which will take place on 25 May 2020.
Year(s) Of Engagement Activity 2019
 
Description Mentions and promotion for Living with Machines Book Launch 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact The Living with Machines book launch was promoted in the following outlets:

https://dhandlib.org/2023/02/23/resource-collaborative-historical-research-in-the-age-of-big-data/
https://ai-uk.turing.ac.uk/fringe-events/
https://royalhistsoc.org/calendar/collaborative-historical-research-in-the-age-of-big-data-lessons-from-an-interdisciplinary-project/

https://twitter.com/LivingwMachines/status/1630994408173608960
Year(s) Of Engagement Activity 2023
 
Description Mia Ridge and Andre Piza, invited participants at AI and Storytelling workshop, King's Digital Lab, King's College London, Apr 1st 2019.
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Researchers and industry reflected on ways of collaborating in the field, with particular attention to the challenges around the engagement of Research Software Engineers, the skills needed, and project frameworks. The workshop consolidated the relationship between KDL and Living with Machines, leading to a second meeting at the Turing with the KDL Director and 2 of their researchers with a view to future collaboration.
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge and Olivia Vane presented at KQ Codes Technical Socials at University College London: 'Research software engineering at one of the world's largest libraries', 20 February 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact The Knowledge Quarter (KQ) Codes Technical Socials at UCL are informal events for anyone with an interest in the computational methods and technology behind research and innovation. They are an opportunity to get to know fellow practitioners, and to discuss and learn about useful tools and techniques which may help with your work.

We gave a presentation on research software engineering at the British Library, including a discussion of RSE roles on Living with Machines.
Year(s) Of Engagement Activity 2020
URL https://www.ucl.ac.uk/research-it-services/programming-hub/kq-codes-technical-socials
 
Description Mia Ridge initiated a meetup for scholars and institutions working with digitised newspapers for humanities research at DH2019, Utrecht
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Group established to discuss the challenges and opportunities for scholars and institutions to collaborate using digitised newspaper collections
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge led panel discussion 'The Past, Present and Future of Digital Scholarship with Newspaper Collections' at Digital Humanities 2019 conference, Utrecht, July 10, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
URL http://www.openobjects.org.uk/2019/07/the-past-present-and-future-of-digital-scholarship-with-newspa...
 
Description Mia Ridge presented 'Living with "Living with Machines": navigating the digital shift at scale' paper accepted for DCDC, Discovering Collections, Discovering Communities, organised by The National Archives and Research Libraries UK (November 2019) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Talk to cultural heritage audience at TNA about the project.
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge presented on the project Living with Machines to Alberta Comer, Dean and University Librarian, University of Utah (May 2019) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge presented on the project at the Library of Congress's Digital Strategy Roundtable, Washington DC (June 2019) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, 'Machine Learning and Digital Humanities' panel, University of Newcastle, Newcastle, September 5, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact As machine learning becomes more common across a wide range of digital solutions, and increasingly factors in our daily lives, it is also being used more frequently in humanities research projects. The possibilities of machine learning need to be understood by humanities researchers and the complexities of the problems investigated in the humanities by those working with machine learning technologies. The humanities can offer a wealth of historical data that presents new challenges to machine learning methodologies: historical records, pictorial representations, literary (or other) text. Recent Digital Humanities projects already employ some machine learning technology, such as with the development of Handwritten Text Recognition (HTR), but the diversification of the data investigated with machine learning approaches has the potential to lead the technology in new and unexpected ways with real-world applications. Panel members include: Beatrice Alex (University of Edinburgh), Noura Al-Moubayed (Durham University), Mia Ridge (British Library), and Melissa Terras (University of Edinburgh).
Year(s) Of Engagement Activity 2019
URL https://n8cir.org.uk/events/machine-learning-and-digital-humanities/
 
Description Mia Ridge, invited presentation, British Library Data Projects workshop, London, August 19, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, Consortium of European Research Libraries (CERL) Annual Seminar 2019, Göttingen, October 9, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Policymakers/politicians
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, KCL / British Library Research Collaboration workshop, King's College London, September 27, 2019
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, Museums + AI Network workshop, Pratt Institute, New York, September 16, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
URL https://www.openobjects.org.uk/2019/09/museums-ai-new-york-workshop-notes/
 
Description Mia Ridge, invited presentation, Princeton University Library, Princeton, September 13, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, Research Libraries UK International Symposium on Digital Scholarship, London, October 14, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This symposium explored the nature and extent of digital scholarship occurring within research libraries across the international research library community. It brought together representatives from international research library associations, funders, the academic community, and global-library collectives to discuss areas of potential cross-sector and interdisciplinary collaboration, and the routes and networks through which this might be achieved. Mia Ridge presented on "Building capacity for digital scholarship at a research library: Living with Machines, and the impact of data science"
Year(s) Of Engagement Activity 2019
URL https://www.rluk.ac.uk/digital-scholarship-and-the-role-of-the-research-library-symposium-slides/
 
Description Mia Ridge, invited presentation, Wellcome Library, London, July 4, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, presentation, Library of Congress Machine Learning Summit, Washington DC, September 20, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Ridge touched on three main kinds of challenges: scale, operational and interdisciplinary, and copyright. A larger scale requires new workflows and quickly grows expensive; operationalizing raises the question of producing public-facing infrastructure; and copyright involves negotiating complex rights issues.
Year(s) Of Engagement Activity 2019
URL https://labs.loc.gov/static/labs/meta/ML-Event-Summary-Final-2020-02-13.pdf?loclr=blogsig
 
Description Netherlands Film Festival 2020: Generous Interfaces panel 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Olivia Vane took part in a panel at the Netherlands Film Festival 2020 (run online because of the Covid pandemic) on Generous Interfaces: "In the Generous Interfaces panel we investigate alternative ways to search audiovisual collections, using De Open Beelden Browser ('The Open Images Browser'). How can you enjoy exploring archives even if you're not looking for anything in particular?".

Olivia gave a presentation and then participated in a panel discussion.
Year(s) Of Engagement Activity 2020
URL https://www.filmfestival.nl/en/collection/nff-conferentie-generous-interfaces/
 
Description New exhibition considers the human impact of rapid technological change in the 19th century 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Press release for the exhibition 'Living with Machines' at the Leeds City Museum, published at turing.ac.uk on the occasion of the opening.
Year(s) Of Engagement Activity 2022
URL https://www.turing.ac.uk/news/new-exhibition-considers-human-impact-rapid-technological-change-19th-...
 
Description Newspapers in 'Living with Machines' 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact An invited talk on the British Library's Newspaper collection and the Living with Machines project for Congrès Médias 19 - Numapresse : Presses anciennes et modernes à l'ère du numérique, at the BnF, 3 June 2022
Year(s) Of Engagement Activity 2022
URL https://figshare.com/articles/presentation/British_Library_Newspapers_and_Living_with_Machines/19963...
 
Description Olivia Vane, Katherine McDonough, Daniel van Strien, 21st Century Curator Talk (British Library staff talks), Maps and Machines: Using Computer Vision to Analyse the Geography of Industrialization (1780-1920), January 13, 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Talk on Using Computer Vision to Analyse the Geography of Industrialization (1780-1920)
Year(s) Of Engagement Activity 2019
 
Description Panel discussion: Expanding and Enriching Metadata through Engagement with Communities 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This panel discusses how cultural institutions are engaging various communities to co-create academic research and/or object metadata in order to increase representation and access to collections; highlighting how this is done in different ways to engage specific audiences and goals, i.e. graduate student assistantships, museum interactive experiences, crowdsourcing, and professional action groups.
Year(s) Of Engagement Activity 2021
URL https://mcn2021virtual.sched.com/event/lwrc/expanding-and-enriching-metadata-through-engagement-with...
 
Description Paper submission: Hunting for Treasure: Living with Machines and the British Library Newspaper Collection
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Abstract: This chapter discusses the open access digitisation programme undertaken by Living with Machines, exploring the range of constraints that inform digitisation strategies and selection priorities. Because the landscape of digitised newspaper collections is so complex, and research and digitisation processes operate on different timelines, we have focused on opportunities to make digitisation choices both transparent and pragmatic. Working towards solutions that reflect collaborations between library staff and scholars, we introduce: a) Press Picker, our custom visualisation tool designed to support decision making about digitisation; and b) the Environmental Scan, a process of automatic metadata generation from the Newspaper Press Directories, a contemporaneous record of British newspapers.
Year(s) Of Engagement Activity 2020
 
Description Participation in the "Computational Approaches for Digitized Historical Newspapers" Dagstuhl Seminar 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact About 20 international researchers were invited to participate in the "Computational Approaches for Digitized Historical Newspapers (22292)" Dagstuhl Seminar, among them two members of the Living with Machines project: Kaspar Beelen and Mariona Coll Ardanuy. Dagstuhl research seminars focus on the exchange and development of ideas on current topics in computer science. This edition focused on analysing the successes and limitations of current computational approaches to historical newspapers, and on discussing future challenges, potential solutions and common strategies. The outcomes of the discussions and findings were published in a report.
Year(s) Of Engagement Activity 2022
URL https://www.dagstuhl.de/en/seminars/seminar-calendar/seminar-details/22292
 
Description Plenary talk at Conference on interdisciplinary and transdisciplinary research for sustainable development (UCLouvain, Belgium) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact The talk sparked many questions, both during and after the event. Several people were interested in knowing more about the activity and noted how it provided them with a completely fresh perspective on how issues in the humanities can be investigated. I also received proposals for further engagement from multiple people, including the organisers of the event.
Year(s) Of Engagement Activity 2022
URL https://uclouvain.be/en/discover/university-transition/conference-sur-la-recherche-interdisciplinair...
 
Description Podcast interview: Crowdsourcing with Dr Mia Ridge, MadeTech Making Tech Better podcast 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact What is crowdsourcing, and how is it used to improve the British Library's online cultural heritage collections? Clare Sudbery talks to crowdsourcing expert Dr Mia Ridge about the power of volunteer digital engagement.
Year(s) Of Engagement Activity 2021
URL https://www.madetech.com/resources/podcasts/episode-14-mia-ridge-2/
 
Description Poster submission: Data for History in Berlin 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster submission for the conference Data for History in Berlin (May 2020) approved.
Due to COVID-19, the conference has been postponed until May 2021.
Year(s) Of Engagement Activity 2020
URL https://d4h2020.sciencesconf.org/
 
Description Presentation 'Historic Census Data and Living with Machines' to Free UK Genealogy's 2021 conference on Open, Global Genealogy (22nd May 2021) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Presentation on 'Historic Census Data and Living with Machines' delivered by Josh Rhodes and Guy Solomon to Free UK Genealogy's 2021 conference on Open, Global Genealogy (22nd May 2021). The presentation gave genealogical professionals, family historians, and other members of the public an insight into how Living with Machines is using historic census data. In particular, we focused on our use of open census data, which is in line with Free UK Genealogy's mission to provide free, online access to historic British census data. The presentation was delivered to an audience of c. 100 on Zoom, and has since received more than 200 views on YouTube. Presenting at this conference enabled the Living with Machines project to establish a closer relationship with Free UK Genealogy, and to begin conversations about sharing data. The presentation also engaged members of the public, who expressed interest in our use of census data, and changed people's minds about what it is possible to achieve at scale with historic census data.
Year(s) Of Engagement Activity 2021
URL https://youtu.be/EY7mwn_sHHU?t=716
 
Description Presentation at CogSci seminar at QMUL 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Talk at the Cognitive Science group at Queen Mary University of London, presenting research on "Animate Machines: A study on atypical animacy detection".
Year(s) Of Engagement Activity 2020
URL http://imc.eecs.qmul.ac.uk/wiki/index.php/Abstract_Mariona_Coll_Ardanuy_25_March_2020
 
Description Presentation at the ESPRit Online Seminar 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Kaspar gave a presentation for the European Society for Periodical Research (ESPRit) Online Seminar on 20 January 2023. The theme of the seminar was "New Computational Approaches to Periodical Studies". The title of the presentation was "Mining Victorian Metadata. A computational analysis of historical press directories".
Year(s) Of Engagement Activity 2023
URL https://www.espr-it.eu/news/events/167-esprit-seminar-20-january-2023
 
Description Presentation at the KBR Digital Heritage Seminar 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Presented "Assessing Biases in Digitized Newspaper Collections" at the Digital Heritage Seminar organized by the Royal Library of Belgium (May 25, 2023).
Year(s) Of Engagement Activity 2023
 
Description Presentation by Daniel Wilson and Ruth Ahnert at Text Mining Parliamentary Data Seminar, University of Umea, Sweden. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact "Tracing the language of machines across genres: books, journals and newspapers", an academic presentation by Daniel Wilson and Ruth Ahnert to a high-profile international seminar featuring luminaries of the field such as Mark Algee-Hewitt, with Prof. Jo Guldi as chair and respondent. The method generated much interest.
Year(s) Of Engagement Activity 2021
URL https://www.umu.se/en/events/comparing-parliaments-novels-and-newspapers_10768814/
 
Description Presentation for the C2DH group in Luxembourg 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Presentation for the C2DH group in Luxembourg. The presentation was part of the "Hands-on History" lectures.
Year(s) Of Engagement Activity 2021
URL https://www.c2dh.uni.lu/events/living-machines-digital-perspectives-industrial-revolution
 
Description Presentation for the History Department at the University of Antwerp 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Presentation on digital methods for history for the History Department at the University of Antwerp.
Year(s) Of Engagement Activity 2021
 
Description Presentation for the Parliamentary Data Seminar (14/10/2021) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Presentation on Targeted Sense Disambiguation at the Parliamentary Data Seminar on the topic "What's really going on".
Year(s) Of Engagement Activity 2021
URL https://www.umu.se/en/events/text-mining-parliamentary-data-seminar-what-is-really-going-on-_1084277...