Living with Machines

Lead Research Organisation: The Alan Turing Institute
Department Name: Research

Abstract

Living with Machines is both a research project and a bold proposal for a new research paradigm. In this ground-breaking partnership between The Alan Turing Institute, the British Library, and the Universities of Cambridge, East Anglia, Exeter, and London (QMUL), historians, data scientists, geographers, computational linguists, and curators have been brought together to examine the human impact of the industrial revolution.

It is widely recognised that Britain was the birthplace of the world's first industrial revolution, yet there is still much to learn about the human, social, and cultural consequences of this historical moment. Focussing on the long nineteenth century (c.1780-1920), the Living with Machines project aims to harness the combined power of massive digitised archives and computational analytical tools to examine the ways in which technology altered the very fabric of human existence on a hitherto unprecedented scale. The central theme - the mechanisation of work practices - speaks directly to present debates about how society can accommodate the revolutionary consequences of AI and robotics in what has become known as the fourth industrial revolution. To understand the fraught co-existence of human and machine, this project contends that we need research methods that combine technological innovation and human expertise.

The project will utilise the British Library's National Newspaper collection, and event-based records (census, electoral registration, births/marriages/deaths, trade directories) collected by contributing partner Findmypast. By developing intuitive computational interfaces and adapting collaborative practices developed in the field of software development, we will enable close interaction between computational methods and historical inquiry.

Outreach and Engagement will be central to the project from the outset, and will take two forms: familiar outcomes such as television programmes and regional exhibitions; and working with individuals and communities to create common understandings of their shared histories. Participatory aspects will embody best practices in crowdsourcing and citizen history.

Project benefits:

1. The UK's first large-scale synergy between data science, artificial intelligence research, and the arts and humanities, building capacity and catalysing new research areas.

2. The development of new computational techniques to marshal the UK's rich archival collections (digitised and born-digital), to enable new research questions to be posed of the holdings.

3. Enriched and interlinked data holdings for the British Library, to add additional context and value to content.

4. The development of generalisable tools, code, and infrastructure that can be adapted for and inspire future interdisciplinary research projects.

5. New historical perspectives on the effects of the mechanisation of labour on the lives of ordinary people during the long nineteenth century.

6. The creation of computational models to represent how language and meanings change across time and geography.

7. Research breakthroughs maintaining UK global leadership in Digital Humanities and driving large-scale international partnerships and opportunities.

Planned Impact

Optional.
 
Description Ruth Ahnert fed into the Forecasting Forum on the Future of Research at the think tank Demos, London, 5 December 2019.
Geographic Reach National 
Policy Influence Type Participation in an advisory committee
URL https://demos.co.uk/wp-content/uploads/2019/10/Jisc-OCT-2019-2.pdf
 
Title Beavan, D., Jackson, M. Plain text and metadata extraction tool 
Description Tool for parallel processing of XML in METS/ALTO format for extraction of plain text and metadata fields, available in XSLT and Python versions. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact This data wrangling tool facilitated downstream analysis of historical newspapers focussing on toponym resolution and OCR quality. It forms an essential part of the preprocessing pipeline that will be applied to new datasets whose acquisition is in progress. 
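As a rough illustration of the kind of extraction described in this entry, the sketch below pulls plain text out of a single ALTO document using only the Python standard library. It is not the project's tool: the function names `local_name` and `alto_to_text` and the `SAMPLE` document are hypothetical, and the sketch assumes the common ALTO layout in which each `TextLine` element holds `String` elements whose `CONTENT` attribute carries one word.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix so the sketch works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_to_text(xml_string):
    """Return the plain text of an ALTO document, one output line per TextLine."""
    root = ET.fromstring(xml_string)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        # Each String child is assumed to carry one word in its CONTENT attribute.
        words = [
            child.attrib.get("CONTENT", "")
            for child in element
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(word for word in words if word))
    return "\n".join(lines)


# Hypothetical two-line sample, used only to exercise the function.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine><String CONTENT="Living"/><SP/><String CONTENT="with"/></TextLine>
    <TextLine><String CONTENT="Machines"/></TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

if __name__ == "__main__":
    print(alto_to_text(SAMPLE))
```

Because each file is handled independently, a function of this shape could be applied to many files at once by a pool of workers, which is one plausible way to obtain the parallelism the entry mentions.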
 
Title Beelen, K., Lexicon Expansion Interface 
Description Notebook for exploring word2vec models in order to build a lexicon that can trace certain topics in a collection. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact The Lexicon Expansion Interface allows users to navigate a vector space and expand a list of seed words into a Lexicon. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/lexicon-expansion/language-l...
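The idea of expanding seed words into a lexicon by navigating a vector space can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a hand-made dictionary of toy vectors (`TOY_VECTORS`) in place of a trained word2vec model, and the function names `cosine` and `expand_lexicon` are hypothetical rather than taken from the project's notebook.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_lexicon(seeds, vectors, top_n=2):
    """Suggest the top_n words closest to any seed word in the vector space."""
    known_seeds = [seed for seed in seeds if seed in vectors]
    if not known_seeds:
        return []
    candidates = []
    for word, vector in vectors.items():
        if word in seeds:
            continue
        # Score each candidate by its similarity to the nearest seed.
        best = max(cosine(vector, vectors[seed]) for seed in known_seeds)
        candidates.append((best, word))
    candidates.sort(reverse=True)
    return [word for _, word in candidates[:top_n]]


# Toy three-dimensional vectors standing in for a trained embedding model.
TOY_VECTORS = {
    "machine": [0.9, 0.1, 0.0],
    "engine": [0.8, 0.2, 0.1],
    "loom": [0.7, 0.3, 0.0],
    "harvest": [0.1, 0.9, 0.2],
    "sermon": [0.0, 0.2, 0.9],
}

if __name__ == "__main__":
    print(expand_lexicon({"machine"}, TOY_VECTORS))
```

In an interactive setting, a user would typically review the suggestions, accept some as new seeds, and repeat the step.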
 
Title Beelen, K., Lexicon Generator, a tool for generating contrastive lexicons using newspaper data 
Description Notebook for building a lexicon by contrasting two corpora using the Fightin' Words algorithm created by Monroe et al, 2008. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact This notebook is an implementation of the "Fightin' Words" algorithm by Monroe et al. It is a feature extraction algorithm that computes which words are most significantly associated with a specific subcorpus. This notebook helps us to "profile" certain types of language (e.g. contrasting conservative with liberal newspapers).
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language-lab-mro/lexi...
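To make the contrastive scoring concrete, the sketch below computes z-scored log-odds ratios between two token lists, in the spirit of Monroe et al. It is a simplified illustration rather than the project's notebook: it assumes a uniform Dirichlet prior (the hypothetical parameter `prior_strength` is added to every word count), whereas the published method also describes informative priors. It also assumes the combined vocabulary holds at least two distinct words.

```python
import math
from collections import Counter


def fightin_words(tokens_a, tokens_b, prior_strength=0.01):
    """Return a z-score per word: positive leans to corpus A, negative to corpus B."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    vocab = sorted(set(counts_a) | set(counts_b))
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    prior_total = prior_strength * len(vocab)  # total mass of the uniform prior
    scores = {}
    for word in vocab:
        y_a = counts_a[word] + prior_strength
        y_b = counts_b[word] + prior_strength
        # Smoothed log-odds of the word within each corpus.
        log_odds_a = math.log(y_a / (n_a + prior_total - y_a))
        log_odds_b = math.log(y_b / (n_b + prior_total - y_b))
        delta = log_odds_a - log_odds_b
        # Approximate variance of the log-odds difference.
        variance = 1.0 / y_a + 1.0 / y_b
        scores[word] = delta / math.sqrt(variance)
    return scores


if __name__ == "__main__":
    corpus_a = "free trade reform reform reform".split()
    corpus_b = "church crown crown crown trade".split()
    ranked = sorted(fightin_words(corpus_a, corpus_b).items(), key=lambda kv: kv[1])
    for word, score in ranked:
        print(f"{word:8s} {score:+.2f}")
```

Words with large positive or negative scores are the candidates for a contrastive lexicon; words used equally in both corpora score close to zero.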
 
Title Beelen, K., Newspaper metadata database and search interface: scripts to build an ElasticSearch index and explore the data using Kibana 
Description Scripts to build an ElasticSearch index and explore the data using Kibana 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Newspaper metadata database and search interface. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/sources-lab-mro/elast...
 
Title Beelen, K., Pipeline for processing the Newspaper Press Directories 
Description The series of notebooks includes a pipeline for processing the OCR (derived from the scans of Mitchell's Press Directories). The stages include: annotation, preprocessing, automatic tagging and database ingest. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact This tool will be crucial for parsing and enriching implicitly structured data (such as the press directories, but also other historical sources). 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/sources-lab-mro/ndp_p...
 
Title Coll Ardanuy, M., Hosseini, K., van Strien, D., McDonough, K., Wilson, D., Krause, A., underlying code for the paper 'Resolving Places, Past and Present: Toponym Resolution in Historical British Newspapers Using Multiple Resources' 
Description Underlying code for the paper 'Resolving Places, Past and Present: Toponym Resolution in Historical British Newspapers Using Multiple Resources'. Resolving Places is one of the first outputs of Living with Machines, a collaborative digital history project at The Alan Turing Institute and the British Library. This research is part of our work to build a nineteenth-century gazetteer that combines place names derived from historical sources (GB1900) with online resources (Wikipedia and Geonames). GB1900 is the result of a crowdsourced project that transcribed all text labels on the 2nd edition 6-inch to 1 mile Ordnance Survey maps of Great Britain (ca. 1900) held by the National Library of Scotland (NLS Maps online). The Living with Machines gazetteer follows best practices in combining multiple existing resources, and is novel in accounting for places that have different scales (e.g. streets, buildings, cities, counties). In the future, we will be adding records and enriching current records with information from OS map 1st edition map label data and other sources. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact This work was presented at a workshop on 27-28 November. Several workshop attendees showed interest in using the gazetteer produced through this code. Subsequent completed work and work in progress, within and outside our project, uses it.
URL https://github.com/alan-turing-institute/lwm_GIR19_resolving_places/
 
Title Coll-Ardanuy, M., Code that builds a gazetteer from scratch 
Description Code and method to generate a gazetteer from Wikipedia and enriched with Geonames data. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Part of larger workflow to create a geographical knowledge base that combines different 19thC knowledge sources together. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language-lab-mro/gaze...
 
Title Coll-Ardanuy, M., Hosseini, K., Nanni, F., Toponym Matching (ongoing) 
Description This work looks for potential locations for each toponym identified in text. It addresses the high degree of variation in toponyms (due to regional spelling differences, transliteration strategies, and cross-language and diachronic variation) as well as variation due to OCR errors.
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? No  
Impact We are building a flexible deep learning framework for candidate selection through toponym matching, using various state-of-the-art neural network architectures, and assessing its performance in different transfer learning settings. We will release several derived datasets, as well as the code. We expect this contribution to have a notable impact on the (geographic) information retrieval, natural language processing, and digital humanities disciplines.
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/toponym-baseline/language-la...
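For orientation, candidate selection through toponym matching can be illustrated with a plain string-similarity baseline. The sketch below is explicitly not the deep learning framework described in this entry: it swaps in `difflib.SequenceMatcher` from the Python standard library, and the function name `rank_candidates`, the `threshold` value, and the sample names are all hypothetical.

```python
from difflib import SequenceMatcher


def rank_candidates(mention, gazetteer_names, threshold=0.7):
    """Return gazetteer names similar to the mention, most similar first."""
    query = mention.lower()
    scored = []
    for name in gazetteer_names:
        score = SequenceMatcher(None, query, name.lower()).ratio()
        if score >= threshold:
            scored.append((score, name))
    # Sort by descending similarity, breaking ties alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


if __name__ == "__main__":
    # "Manchefter" imitates a long-s OCR error for "Manchester".
    print(rank_candidates("Manchefter", ["Manchester", "Winchester", "London"]))
```

A learned matcher would replace the similarity function, while the surrounding loop (score every candidate, keep those above a cut-off, rank) could stay much the same.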
 
Title Hobson, T., Tolfo, G. Methodological paper on Living with Machines' metamodel 
Description Data modelling methodology developed to underpin data infrastructure with the aim of promoting interoperability of tools and systems and accessibility of data and derived artefacts within the project and externally. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact The common data model developed by this method has been used in the design of relational database schemas and other research infrastructure to support interoperability across different source data types and varied research activities. 
URL https://www.overleaf.com/read/qjqqfdrqxkpr
 
Title Hosseini, K. and Vane, O. PressPicker code 
Description The PressPicker tool can be used to filter and visualise British Library holdings of undigitised newspapers as a function of time. It is also an interactive tool to pick newspaper titles (e.g. for digitisation). It consists of two Python Jupyter notebooks and a custom JavaScript interactive visualisation. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Successfully made two selections of newspaper titles for digitising within Living with Machines. 
 
Title Hosseini, K., Beelen, K., basic lexicon expansion algorithms using word embeddings 
Description In this notebook, we use the trained word embeddings (using word2vec or fasttext models) to explore the semantic space of our book and sample newspaper datasets. Several basic methods are implemented, e.g. explore the neighbouring words given a seed word (e.g., what are the most similar words to "machine" given our corpus?); visualisation of word vectors using t-SNE. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact This work is in progress. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/blob/master/language-lab-mro/lexi...
 
Title Hosseini, K., exploratory data analysis of GB1900 dataset 
Description A set of Jupyter notebooks for visualisation and statistical analysis of the GB1900 dataset.
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These Jupyter-notebooks were developed to explore the GB1900 dataset, including visualisation of various entities (e.g., railway) on a map. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/space-time-mro/gb1900...
 
Title Hosseini, K., exploratory data analysis of newspaper/book databases 
Description A set of Jupyter-notebooks to perform exploratory data analysis on newspaper and book databases. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These notebooks were developed as teaching/research tools to: 1) show how to access a remote Postgres DB, query it, and plot the results; 2) perform exploratory data analysis (e.g., visualisation and simple statistical analysis) on the data.
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/relational_database_e...
 
Title Hosseini, K., from raw data to language-models/word-embeddings 
Description These notebooks combined form a pipeline in which raw book/newspaper textual data can be accessed, preprocessed and then used to generate word embeddings and language models. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These notebooks (and their Python-script version) have been extensively used to generate word2vec, fasttext, Flair and BERT language models. These models are being used in several NLP-related projects. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language_models/noteb...
 
Title Hosseini, K., intrinsic evaluation of word embeddings / language models 
Description The performance of any trained machine learning model needs to be evaluated (intrinsically or extrinsically) before being used. Here, we collected several datasets and developed code to evaluate trained word embeddings and language models.
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Evaluation of all word-embeddings/language models being used in the project. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/blob/master/language_models/noteb...
 
Title Hosseini, K., parallel processing of book (and newspaper) dataset using MPI (Message Passing Interface) 
Description As we are dealing with large textual datasets (e.g., our book dataset contains 4.5B words), we started to experiment with different distributed and parallel algorithms to preprocess the data and to train machine learning models. Here, we used MPI (Message Passing Interface) through Python. This code distributes the job among the requested number of CPUs (workers), which can be on different nodes in a supercomputer (i.e. not limited to shared-memory machines); therefore, it significantly reduces the wall time. This code was tested on Urika. Unfortunately, Urika is no longer available, and we now exclusively use Azure virtual machines (VMs). These VMs are shared-memory, so we switched to simpler parallel-processing algorithms. However, the MPI algorithm and tools developed here should be usable later when we have access to even larger datasets.
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Preprocess and extract information (e.g., part-of-speech tagging) from large textual datasets. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language_models/mpi_v...
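The core of distributing a job among workers is deciding which slice of the input each worker handles. The sketch below shows one common way to do this; it is an assumption-laden illustration, not the project's code. The function name `chunk_for_rank` is hypothetical, and `rank` and `size` are passed in explicitly so the example runs without an MPI installation. With a library such as mpi4py, these values would typically come from `MPI.COMM_WORLD.Get_rank()` and `MPI.COMM_WORLD.Get_size()`.

```python
def chunk_for_rank(items, rank, size):
    """Return the contiguous slice of items that worker `rank` out of `size` handles.

    The first (len(items) % size) workers receive one extra item, so the
    chunk lengths differ by at most one.
    """
    base, extra = divmod(len(items), size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return items[start:stop]


if __name__ == "__main__":
    # Hypothetical file list standing in for book or newspaper files.
    files = [f"book_{i:03d}.txt" for i in range(10)]
    for rank in range(3):
        print(rank, chunk_for_rank(files, rank, 3))
```

Each worker would then preprocess only its own slice, and the per-worker results could be gathered on one process at the end.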
 
Title Hosseini, K., record linkage using various multi-class classifiers and manual annotations 
Description Record linkage across two noisy datasets (for example, historical texts) is a non-trivial task. In this tool, we experimented with different multi-class classifiers, e.g. decision tree and multilayer perceptron architectures. We also assessed the impact of features (e.g., title, date and place of publication) on the statistical performance of these models. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Creating a list of linked entities between NPD (newspaper press directory) and British Library titles list. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/sources-lab-mro/linki...
 
Title Hosseini, K., upload images to Zooniverse 
Description ~10,000 images from the digitised newspaper articles were selected and uploaded to Zooniverse for annotation. Defoe, a spark-based toolbox for analysing digital historical textual data, was used to select the images. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact The human/expert annotation is one of the main ingredients in training and evaluating supervised machine learning methods. The results of this experiment can be used in various tasks, e.g., sentence/document classification. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/communities-mro/zooni...
 
Title Vane, O. OS maps metadata visualisation code 
Description Custom visualisation of digitised 19th Century Ordnance Survey maps (from National Library of Scotland) to investigate patterns of map revision through time. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Used tool to create supporting material for BL map digitisation proposal and to help identify suitable locations for historical case studies (factors include OS map coverage). 
 
Title Vane, O., Code for filtering Kings Topographical map collection metadata 
Description Python Jupyter notebook for filtering British Library KTop metadata by geography and time. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Identifying relevant digitised material for Living with Machines research. 
 
Title Vane, O., Code underlying a blogpost about how to put a D3 JavaScript visualisation in a Python Jupyter notebook. 
Description Jupyter notebook demonstrating how to use JavaScript and the D3 visualisation library in a Python Jupyter notebook. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact Email from a blog reader describing it as very helpful. 
URL https://github.com/alan-turing-institute/D3_JS_viz_in_a_Python_Jupyter_notebook
 
Title Vane, O., Strabo output visualisation code 
Description Visualising the output of 'Strabo' tool (software tool to auto-transcribe text in historical maps by researchers at the University of Southern California Spatial Informatics Laboratory). 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Non-statistical evaluation of the Strabo tool's success with our map data.
 
Title van Strien, D., Beelen, K., Coll Ardanuy, M., Hosseini, K., McGillivray, B., Colavizza, G., underlying code for the paper 'Assessing the Impact of OCR Quality on Downstream NLP Tasks' 
Description These notebooks contain the underlying code for the paper 'Assessing the Impact of OCR Quality on Downstream NLP Tasks'. The code runs experiments reported in the paper and generates the figures used in the paper. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact This code helps the project better understand issues relating to OCR technology and will inform research methods for our projects and other projects working with text produced through OCR. 
URL https://github.com/alan-turing-institute/lwm_ARTIDIGH_2020_OCR_impact_downstream_NLP_tasks
 
Title van Strien, D., Beelen, K., McDonough, K. 4 Jupyter notebooks on basic computer vision methods for historic OS maps 
Description These notebooks provide an explanation on using computer vision methods with historic maps. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These notebooks have been used in two workshops with >40 participants. They will be developed further into a series of tutorials. 
 
Title van Strien, D., Beelen, K., McDonough, K. 5 Jupyter notebooks on using Deep-learning methods for computer vision on historic OS maps 
Description Additional notebooks on using computer vision methods with historic digitised map collections. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact These notebooks have been used as teaching materials in two workshops and will be developed further into publicly available tutorials. 
 
Title van Strien, D., Prototype Maps annotation pipeline 
Description A prototype method for collecting annotations from researchers, running classification and analysing historic maps at scale. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? No  
Impact These methods have been used as an initial prototype which is currently being developed further inside the project. 
 
Title Book and newspaper databases 
Description This database consists of ~49K books (metadata and full-text, 4.5B words) and 11.8M newspaper pages (only metadata). We used the "Azure Database for PostgreSQL" service to manage this database. Various scripts and Jupyter notebooks have been developed to access this database and perform exploratory data analysis.
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact This database has been used in various text mining and natural language processing tasks, such as: 1) Generating language models including word2vec, fasttext, Flair and BERT type models. The book database was mainly used here as it has a large number of books suitable for training stable language models; however, we also trained several models using a sample from newspaper articles. 2) Pre-trained models used in "Assessing the Impact of OCR Quality on Downstream NLP Tasks" paper. 3) Developing the processing pipeline. 
 
Title G. Tolfo, O. Vane, K. McDonough, Metadata for BL map collections including the King's Topographical Map collection and the Goad Fire Insurance Maps. 
Description Csv exports of British Library records for King's Topographical collection maps and Goad Fire Insurance maps. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact We are in the process of making this data interoperable with our other map metadata from the National Library of Scotland, at which point we will release it to the public so that it is a tool for improving discovery of digitised map content in British heritage institutions.
 
Title K. McDonough, O. Vane, A. Krause, C. Fleet, Metadata from National Library of Scotland Ordnance Survey map collections 
Description Shapefile metadata for Ordnance Survey map sheets received from Chris Fleet at the National Library of Scotland for analysis alongside digitised images of NLS Ordnance Survey maps. We exported this data and built a relational database to make the data more accessible.
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact By re-formatting this data and linking it to additional metadata, we are enabling a better understanding of a) where there are concentrations or gaps in the digital record and b) how revision practices varied over British space. 
 
Title Kasra Hosseini, language model zoo 
Description Collection of trained word embeddings and language models, mainly by using the book database. Various model types are trained and added to the collection, e.g., word2vec, fasttext, contextual string embeddings (Flair), BERT. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact Language models and word-embeddings are one of the main ingredients in many NLP-related tasks in this project. Here, we keep track of the trained models, so researchers can easily find the models and use them for their research. 
URL https://github.com/alan-turing-institute/Living-with-Machines-code/blob/master/language_models/noteb...
 
Title Mariona Coll-Ardanuy - Creation of toponym resolution datasets (ongoing). 
Description Creation of toponym resolution datasets: ~1000 newspaper articles manually annotated with mentions of places and their geographical coordinates. The annotations are not yet complete. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? No  
Impact Ongoing work. We aim to publish the dataset as soon as the annotations are complete. It will serve to assess the performance of our toponym resolution method and will be a contribution to several fields, such as geographic information retrieval, computational linguistics, and digital humanities.
 
Title Mariona Coll-Ardanuy, Creation of a gazetteer for toponym resolution (ongoing). 
Description Creation of a gazetteer for toponym resolution (alpha version). This is a Wikipedia-based gazetteer, enriched with data from the geographical database Geonames. The alpha version of the code that creates the gazetteer has already been released (see URL below). This work is ongoing: we are working on enriching it with data from historical sources (maps and text). 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact The gazetteer has not been made available, but the publication and the code repository with instructions on how to create it are publicly available.
URL https://github.com/alan-turing-institute/lwm_GIR19_resolving_places
 
Title Newspaper Directories digitised, OCRed, modelled and structured data extracted from Mitchell's directories (1846-1909) 
Description This collection includes a subset of Mitchell's Newspaper Press Directories which is annotated and structured for future incorporation in the Newspaper database.
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact The information extracted from the Press Directories will significantly contribute to enriching newspaper data received from Heritage Made Digital, FindMyPast and JISC. It will also contribute to the environmental scan project and paper. 
 
Description Humphrey Southall (Vision of Britain) 
Organisation University of Southampton
Country United Kingdom 
Sector Academic/University 
PI Contribution Reuse of data and citation.
Collaborator Contribution Data sets shared in addition to those available for download on the Vision of Britain site, including a simplified data table.
Impact Data sharing.
Start Year 2019
 
Description Living with Machines and Find My Past 
Organisation Findmypast
Country United Kingdom 
Sector Private 
PI Contribution We will be sharing the methods and outcomes of our research on this data, for example OCR correction, and toponym resolution.
Collaborator Contribution FMP has shared newspaper data with Living with Machines for two counties (Lancashire and Dorset), and in the near future will be sharing all newspapers from Britain dating 1780-1920 that were digitised by FMP for the British Newspaper Archive. A member of FMP also sits on Living with Machines' Advisory Board.
Impact Findmypast has provided samples of the British Library's digitised Newspaper Collection and has advised us through its membership of the Living with Machines Advisory Board. There are prospects of working together on OCR correction following the ingestion of other incoming full datasets from the same collection.
Start Year 2018
 
Description Living with Machines and National Library of Scotland 
Organisation National Library of Scotland
Country United Kingdom 
Sector Academic/University 
PI Contribution Living with Machines initiated contact with Chris Fleet, map curator at the NLS, to investigate access to their digitized map collections. K. McDonough and O. Vane have worked closely with Fleet over the last 9 months to share and evaluate the digital map holdings. We organized a workshop (June 2019) at the Turing/BL with Chris and other historical maps experts to explore best practices in working with large collections for a digital humanities project. We have shared back reflections and code for enriching the collection metadata and visualizing the collections, and have also developed a close working relationship that will continue to grow (through the sharing of additional maps and metadata as well as collaborative research into other ways of sharing digital map data with researchers through IIIF).
Collaborator Contribution NLS Maps curator Chris Fleet has shared a subset of the 200,000 digitized sheets, to be expanded on in the near future. He has provided extensive advice and support for working with the metadata, accessing versions of the maps as web map tiles, and thinking about the next steps of using these materials in a computational research environment. He has also been immensely helpful in connecting Living with Machines to the small, but growing community of researchers using machine learning methods with maps.
Impact Blog posts (Computational Approaches to Ordnance Survey Maps: Finding words in maps, part 2: seeing the results blog post); Talks (Katie McDonough, Olivia Vane, and Daniel van Strien gave a '21st Century Talk' for British Library staff: 'Maps and Machines: Using Computer Vision to Analyze the Geography of Industrialization (1780-1920)', 14 Jan 2020; Daniel van Strien, Kaspar Beelen, CREATE Digital History Workshop: Maps-as-Data: Analysing Historical Maps with Computer Vision, Feb 2020; Katherine McDonough, "Living with Machines," presentation at DH Seminar, Center for Spatial and Textual Analysis, Stanford University, December 2 2019; Katherine McDonough, "Living with Machines," invited presentation at Spatial Relationships in Text as Data, The Alan Turing Institute, October 28, 2019; Katherine McDonough and Jon Lawrence, "An introduction to Living with Machines," University of Exeter DH Seminar, 23 October 2019); Workshops (Daniel van Strien, British Library Digital Scholarship Training program, workshop on computer vision for historical maps, 13 February 2020; and Katherine McDonough, Fantastic Futures, invited presentation and workshop on computer vision for historical maps, 4-5 December 2019); and Meetings (Katherine McDonough organized meetings with US experts in historical map processing using computer vision (29/8/2019 and 1/11/2019)).
Start Year 2019
 
Title defoe, the Spark-based toolbox for analysing historical datasets
Description This work presents defoe, a new scalable and portable digital eScience toolbox that enables historical research. It allows for running text mining queries across large datasets, such as historical newspapers and books, in parallel via Apache Spark. It handles queries against collections that comprise several XML schemas and physical representations. The proposed tool has been successfully evaluated using five different large-scale historical text datasets and two HPC environments, as well as on desktops. Results show that defoe allows researchers to query multiple datasets in parallel from a single command-line interface and in a consistent way, without any HPC environment-specific requirements.
Type Of Technology Software 
Year Produced 2019 
Impact Originally developed by UCL and the British Library (funded by Jisc, 2015) then UCL (funded by 2016-2018), defoe was refactored and extended by EPCC, The University of Edinburgh, for: the Alan Turing Institute, funded by Scottish Enterprise as part of the Alan Turing Institute-Scottish Enterprise Data Engineering Program; the College of Arts Humanities and Social Sciences, The University of Edinburgh (2019-2020), as part of the Data Driven Innovation Programme funded by the Edinburgh and South-East Scotland City Region Deal; and Living with Machines (2019-2020).
URL https://github.com/alan-turing-institute/defoe
 
Title defoe_visualization, a collection of notebooks for analysing further the results obtained by defoe 
Description defoe_visualization is a repository of Jupyter notebooks which complements the defoe scalable and portable digital eScience toolbox for historical research. These notebooks allow researchers to explore query results from defoe and to post-process the results to reveal new insights into the historical data processed by defoe. The notebooks are complemented with sample data files with the query results produced by the authors. 
Type Of Technology Software 
Year Produced 2019 
Impact Developed by EPCC, The University of Edinburgh in conjunction with: the Alan Turing Institute (2018-2019) funded by Scottish Enterprise as part of the Alan Turing Institute-Scottish Enterprise Data Engineering Program; the College of Arts Humanities and Social Sciences, The University of Edinburgh (2019-2020) as part of the Data Driven Innovation Programme (funded by the Edinburgh and South-East Scotland City Region Deal); and Living with Machines (2019-2020). 
URL https://github.com/alan-turing-institute/defoe_visualization
 
Description "How we collaborate" blog post series 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Blog post series reflecting on our experience of collaborating on the project.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/category/how-we-collaborate/
 
Description "Introducing the Language Lab" blog post 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Blogpost introducing the language lab, which explored the social and cultural impact of the Industrial Revolution as reported in newspapers and other types of textual sources.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/introducing-the-language-lab/
 
Description "Introducing..." blog post series 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact We published a series of blog posts introducing each member of the Living with Machines team
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/category/the-team/
 
Description Andre Piza presented at "Future of Journalism" to Open Society Foundation Journalism Programme 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation about the Living with Machines project started a dialogue with BBC News Labs and Open Society, leading to a talk from BBC News Labs Executive Product Manager (David Caswell) at the Alan Turing Institute and a visit from Open Society's Independent Journalism Senior Programme Specialist (Shuwei Fang). Opportunities for collaboration with LWM are now being explored with BBC News Labs.
Year(s) Of Engagement Activity 2019
 
Description Annotation session with the British Library staff, 2 August 2019, organised by Daniel van Strien, Mariona Coll Ardanuy, and Mia Ridge 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact We had an open annotation session in which we invited British Library staff members to help with our experiments. We planned four different linguistic annotation tasks (named entity recognition, recognition of machines, entity linking to Wikipedia, and semantic role labeling) on newspaper articles from the nineteenth century.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/collecting-annotations-from-british-library-staff/
 
Description Blog Post on Sources Lab (Understanding the Victorian Newspaper Landscape) 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Blog post describing the work of the Sources Lab on digitizing and processing the Newspaper Press Directories.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/sources-understanding-the-victorian-newspaper-landscape/
 
Description British Library Open House Session at Boston Spa 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact The Library's Living with Machines team provides an update on this collaborative project, with updates on the ways in which its work with data science and digitised collections benefits the Library
Year(s) Of Engagement Activity 2020
 
Description British Library Open House Session at King's Cross St. Pancras 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact The Library's Living with Machines team provides an update on this collaborative project, with updates on the ways in which its work with data science and digitised collections benefits the Library
Year(s) Of Engagement Activity 2020
 
Description British Library Show and Tell Session at King's Cross St. Pancras 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other audiences
Results and Impact Presentation about the various tasks and outcomes of the Projects Labs
Year(s) Of Engagement Activity 2019
 
Description Cambridge GLAM Digital champions lightning talk "The Living with machines project" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact I presented the Living with Machines project to an audience of librarians and other professionals from the GLAM (Galleries, Libraries, Archives, Museums) sector.
Year(s) Of Engagement Activity 2020
URL https://www.eventbrite.co.uk/e/glam-digital-champions-digital-lunch-january-2020-tickets-89946158381...
 
Description Code and Coffee 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post describing an internal project activity aimed at facilitating collaboration
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/code-and-coffee/
 
Description Collecting annotations from British Library staff 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post outlining an event held with British Library staff
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/collecting-annotations-from-british-library-staff/
 
Description Computational Approaches to Ordnance Survey Maps blog post 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This blog post introduces the preliminary work of the "Space & Time Lab" in Living with Machines, which experimented with computer vision methods for studying large sets of historical, digitized maps. With 179 page views, it generated several conversations with external researchers about our use of these methods in the humanities context.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/introducing-the-space-and-time-lab/
 
Description D3 JavaScript visualisation in a Python Jupyter notebook 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post describing how to combine JavaScript, the visualisation library D3.js and Python Jupyter notebooks. Accompanying notebook code was published with this blogpost.
Year(s) Of Engagement Activity 2020
URL https://livingwithmachines.ac.uk/d3-javascript-visualisation-in-a-python-jupyter-notebook/
 
Description Daniel van Strien, British Library Digital Scholarship Training program, workshop on computer vision for historical maps, 13 February 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact A workshop held for British Library staff on using Computer Vision methods with heritage data including historic map collections.
Year(s) Of Engagement Activity 2020
 
Description Daniel van Strien, Kaspar Beelen, CREATE Digital History Workshop: Maps-as-Data: Analysing Historical Maps with Computer Vision, Feb 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Workshop on using Computer Vision methods with historical collections, held at the CREATE centre at the University of Amsterdam.
Year(s) Of Engagement Activity 2020
URL https://www.create.humanities.uva.nl/events/digital-history-workshop-maps-as-data-analysing-historic...
 
Description Daniel van Strien, Katherine McDonough, Daniel Wilson presented at Victorian Data Conference, University of Virginia, November 15-16, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Three Living with Machines members presented in a session about "Living with Bias" at the Victorian Data conference, the first gathering of nineteenth-century studies scholars using digital methods in their work. Attended by about 100 researchers, our presentation both introduced Living with Machines to this largely US-based audience and generated several connections which have already resulted in visits to the Turing/BL in London in 2020 (including the faculty director of the University of Virginia Scholars' Lab, Alison Booth, who was a co-host of this conference).
Year(s) Of Engagement Activity 2019
URL http://data-caucus.herokuapp.com/conference-cfp
 
Description David Beavan and James Hetherington contributing to Royal Society 'Dynamics of data science skills' 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Contribution to report - see link.
Year(s) Of Engagement Activity 2019
URL https://royalsociety.org/topics-policy/projects/dynamics-of-data-science/
 
Description David Beavan invited 'floating expert' and Mia Ridge, Dr. Katherine McDonough, Dr. Kaspar Beelen and Dr. Kasra Hosseini (project collaborator) invited participants at Computational Archival Science Workshop: Exploring Data, Investigating Methodologies, The National Archives, 20-21 June 2019 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact About 100 people attended this event, where Kaspar Beelen and Katie McDonough presented the keynote lecture on bias in digitized archival collections being used in the Living with Machines project. The international audience included GLAM professionals and students from the US, UK, and elsewhere in Europe, and the event fostered conversations about the role of GLAM institutions in collaborating with researchers to develop best practices for creating, preserving, and making accessible digitised and born-digital collections.
Year(s) Of Engagement Activity 2020
URL https://blog.nationalarchives.gov.uk/computational-archival-science-cas-exploring-data-investigating...
 
Description David Beavan invited presentation at Software Development in Digital Humanities Labs and Projects, University of Sussex, 30 July 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description David Beavan led, Mia Ridge, Barbara McGillivray participated in panel discussion 'Data Science & Digital Humanities: new collaborations, new opportunities and new complexities' at Digital Humanities 2019 conference, Utrecht, July 11, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This panel highlights the emerging collaborations and opportunities between the fields of Digital Humanities (DH), Data Science (DS) and Artificial Intelligence (AI). It charts the enthusiastic progress of the Alan Turing Institute, the UK national institute for data science and artificial intelligence, as it engages with cultural heritage institutions and academics from arts, humanities and social sciences disciplines. We discuss the exciting work and learnings from various new activities, across a number of high-profile institutions. As these initiatives push the intellectual and computational boundaries, the panel considers the gains, benefits, and complexities encountered. The panel latterly turns towards the future of such interdisciplinary working, considering how DS & DH collaborations can grow, with a view towards a manifesto. As Data Science grows globally, this panel session will stimulate new discussion and direction, to help ensure the fields grow together and arts & humanities remain a strong focus of DS & AI, and so that DH methods and practices continue to benefit from new developments in DS, which will enable future research avenues and questions.
Year(s) Of Engagement Activity 2019
URL https://dev.clariah.nl/files/dh2019/boa/0364.html
 
Description David Beavan presented at Turing Innovation Symposium, hosted by Accenture, Dublin, 3-4 April 2019. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview of Living with Machines for Turing Innovation Showcase in Dublin 2019.
Year(s) Of Engagement Activity 2019
 
Description David Beavan presented talk 'Potential Uses of a Registry of Digitised Works: By scholars' at Global Digitised Dataset Network, British Library, 10 June 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact Lessons from the project on uses of a registry of digitised works
Year(s) Of Engagement Activity 2019
URL https://gddnetwork.arts.gla.ac.uk/
 
Description Deep learning reading group 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post introducing an internal reading group on deep-learning methods being used by the project.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/deep-learning-reading-group/
 
Description Developing Data Study Group with TNA on (web) archives and social attitudes towards new technologies, initiated by Barbara McGillivray and David Beavan 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Data Study Groups are intensive five-day 'collaborative hackathons' hosted at the Turing, which bring together organisations from industry, government, and the third sector with talented multi-disciplinary researchers from academia. Beavan and McGillivray co-organised a DSG with the National Archives on "Discovering topics and trends in the UK Government Web Archive".
Year(s) Of Engagement Activity 2019
URL https://www.turing.ac.uk/events/data-study-group-december-2019
 
Description Did Machines Drive History? 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post introducing the first minimum research outcome of the language lab, in which we explored to what extent machines were being seen as agents able to drive change.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/did-machines-drive-history/
 
Description Emma Griffin invited presentation: International symposium - 'Dartmouth and the World', Dartmouth University, 10-20 October 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Gave talk on "Life and Living Standards in Britain's Industrial Revolution"
Year(s) Of Engagement Activity 2019
 
Description Emma Griffin invited presentation: Oregon State University, Centre for the Humanities, 7 October 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Talk on "Home Economics: Food, Money, and Emotions in Victorian Britain"
Year(s) Of Engagement Activity 2019
 
Description Finding words in maps, part 2: seeing the results 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post about evaluating the 'Strabo' tool (software for transcribing text in digitised historical maps) on our map data through visualisation.
Year(s) Of Engagement Activity 2019
URL https://livingwithmachines.ac.uk/finding-words-in-maps-part-2-seeing-the-results/
 
Description Introduction to Python, with Mariona Coll Ardanuy, July 19th 2019, organised by Mariona Coll Ardanuy for Turing Community 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact 4-hour introductory course in programming for the Humanities, with a focus on text processing and data wrangling (e.g. opening and working with documents and file paths). The feedback was very positive. Participants got acquainted with the basics of Python programming, which they have been able to apply to the project on multiple occasions.
Year(s) Of Engagement Activity 2019
 
Description Jon Lawrence, Inter-Disciplinary Research Programme Assessor for British Academy - 'The Humanities and Social Sciences Tackling the UK's International Challenges' (2019) 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Assessing projects under the heading "The Humanities and Social Sciences Tackling the UK's International Challenges"
Year(s) Of Engagement Activity 2019
URL https://www.thebritishacademy.ac.uk/programmes/tackling-uk-international-challenges
 
Description Kaspar Beelen "Surveying the Newspaper Landscape" (CREATE Salon, February) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation at the University of Amsterdam attended by ca. 20 people. It was part of the "Salon" series organized by CREATE Amsterdam (Julia Noordegraaf).
Year(s) Of Engagement Activity 2020
URL https://www.create.humanities.uva.nl/
 
Description Kaspar Beelen Presentation for the British Library News Collection Group 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Presentation on the digitization of the Newspaper Press Directories and how this feeds into understanding the shape and contours of digital newspaper collections.
Year(s) Of Engagement Activity 2020
 
Description Kaspar Beelen and Katherine McDonough, Keynote presentation at the Computational Archival Science symposium "Surveying the Land" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Keynote presentation by Kaspar Beelen and Katherine McDonough at the Computational Archival Science Symposium, organized at the Alan Turing Institute (January 2020).
Year(s) Of Engagement Activity 2020
URL https://www.turing.ac.uk/events/computational-archival-science-cas-symposium
 
Description Kaspar Beelen, Invited talk "Stereotypes in Newspaper data" at the Dutch National Library Research Week, September 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact Presentation at the Dutch Royal Library (KB) to report on the progress of my Research in Residence programme. It was part of the KB "Research Week" and was the most popular in terms of people signing up.
Year(s) Of Engagement Activity 2019
 
Description Kaspar Beelen, Panel discussion on Coding Literacy in the Digital Humanities, at Digital Humanities Benelux, September 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Participation in a round table on the topic of "Coding Literacy in the Humanities" (organized by Marijn Koolen, Liliana Melgar and Mari Wigham). The round table included a presentation with different experts (Joris van Zundert, Elli Bleeker, Sally Chambers) and discussion with an audience of Digital Humanities experts.
Year(s) Of Engagement Activity 2019
URL http://2019.dhbenelux.org/wp-content/uploads/sites/13/2020/01/DH_Benelux_2019_paper_25.pdf
 
Description Kaspar Beelen, Presentation on "Bias in the British Newspaper Archive" at Digital Humanities Benelux, September 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact 15-minute paper presentation on the work that emerged out of the Sources Lab, focussed on understanding the newspaper landscape. Attended by ca. 25 people from various backgrounds (DH researchers, librarians).
Year(s) Of Engagement Activity 2019
URL http://2019.dhbenelux.org/wp-content/uploads/sites/13/2019/08/DH_Benelux_2019_paper_33.pdf
 
Description Kaspar Beelen, Presentation on "The Agency of Machines" at Digital Humanities Benelux, September 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Presentation reporting on "The Agency of Machines" at the poster session of Digital Humanities Benelux, 2019. It involved discussion with many interested attendees of the conference.
Year(s) Of Engagement Activity 2019
URL http://2019.dhbenelux.org/program/
 
Description Kaspar Beelen, Seminar on History and Text, Antwerp University, November 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Undergraduate students
Results and Impact Presentation on the use of Text Mining for History. Part of the course "History and Language" (BA2) organised by Marnix Beyen (University of Antwerp).
Year(s) Of Engagement Activity 2019
 
Description Katherine McDonough and Jon Lawrence, "An introduction to Living with Machines," University of Exeter DH Seminar, 23 October 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other audiences
Results and Impact Presentation to about 40 people at the DH Seminar at Exeter was a great opportunity to make contact with the expert community there and introduce them to our ongoing work.
Year(s) Of Engagement Activity 2019
URL http://www.exeter.ac.uk/news/events/details/index.php?event=9637
 
Description Katherine McDonough organized meeting with US experts in historical map processing using computer vision (29/8/2019 and 1/11/2019) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Conversation to plan for future collaboration with researchers working at the cutting edge of computer vision for historical maps in the United States.
Year(s) Of Engagement Activity 2019
 
Description Katherine McDonough, "Living with Machines," invited presentation at Spatial Relationships in Text as Data, The Alan Turing Institute, October 28, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Invited talk to review applications of research on qualitative spatial relations in the Living with Machines project. Question session offered an opportunity to learn about related research in the UK and to share our ongoing work with leaders in the field.
Year(s) Of Engagement Activity 2019
URL https://www.eventbrite.co.uk/e/spatial-relationships-in-text-as-data-tickets-76259685773
 
Description Katherine McDonough, "Living with Machines," presentation at DH Seminar, Center for Spatial and Textual Analysis, Stanford University, December 2 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact About 60 people attended a presentation at Stanford University about Living with Machines. This conversation has created substantive links to the DH community at Stanford and there is continued interest in collaborating with us in the future.
Year(s) Of Engagement Activity 2019
URL https://cesta.stanford.edu/events/cesta-seminar-dr-katie-mcdonough
 
Description Katherine McDonough, Fantastic Futures, invited presentation and workshop on computer vision for historical maps, 4-5 December 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Presented Living with Machines research on computer vision with maps during a roundtable on applications of AI in GLAM institutions, generating conversation with an international audience about working with visual heritage materials at scale. The workshop offered GLAM staff, researchers, and policy leaders an opportunity for hands-on experience in computer vision, which has translated into invitations for collaboration and additional teaching opportunities.
Year(s) Of Engagement Activity 2019
URL https://fantasticfutures.stanford.edu/
 
Description Katie McDonough, Olivia Vane, and Daniel Van Strien gave a '21st Century Talk' for British Library staff: 'Maps and Machines: Using Computer Vision to Analyze the Geography of Industrialization (1780-1920)', 14 Jan 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Delivered a talk about using computer vision techniques to analyse digitised historical maps at scale.
Year(s) Of Engagement Activity 2020
 
Description Living with Machines OCR hack 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A blog post outlining an internal 'hack' event focused on OCR.
Year(s) Of Engagement Activity 2019
URL http://livingwithmachines.ac.uk/living-with-machines-ocr-hack/
 
Description Mariona Coll-Ardanuy, Presentation at CogSci seminar at QMUL (13/06/2019) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other audiences
Results and Impact Talk at the Cognitive Science group at Queen Mary University of London, presenting preliminary research on the language lab work for Living with Machines. There were very relevant comments, and interesting questions as well. A subsequent talk at the Cognitive Science seminar was planned, which will take place on 25 May 2020.
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge and Andre Piza, invited participants at AI and Storytelling workshop, King's Digital Lab, King's College London, Apr 1st 2019. 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Researchers and industry reflected on ways of collaborating in the field, with particular attention to the challenges around engagement of Research Software Engineers, needed skills and project frameworks. Consolidated relationship between KDL and Living with Machines, leading to a second meeting at the Turing with the KDL Director and 2 of their researchers with a view to future collaboration.
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge and Olivia Vane presented at KQ Codes Technical Socials at University College London: 'Research software engineering at one of the world's largest libraries', 20 February 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact The Knowledge Quarter (KQ) Codes Technical Socials at UCL are informal events for anyone with an interest in the computational methods and technology behind research and innovation. They are an opportunity to get to know fellow practitioners, and to discuss and learn about useful tools and techniques which may help with your work.

We gave a presentation on research software engineering at the British Library, including a discussion of RSE roles on Living with Machines.
Year(s) Of Engagement Activity 2020
URL https://www.ucl.ac.uk/research-it-services/programming-hub/kq-codes-technical-socials
 
Description Mia Ridge initiated a meetup for scholars and institutions working with digitised newspapers for humanities research at DH2019, Utrecht 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Group established to discuss the challenges and opportunities for scholars and institutions to collaborate using digitised newspaper collections
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge led panel discussion 'The Past, Present and Future of Digital Scholarship with Newspaper Collections' at Digital Humanities 2019 conference, Utrecht, July 10, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
URL http://www.openobjects.org.uk/2019/07/the-past-present-and-future-of-digital-scholarship-with-newspa...
 
Description Mia Ridge presented 'Living with "Living with Machines": navigating the digital shift at scale' paper accepted for DCDC, Discovering Collections, Discovering Communities, organised by The National Archives and Research Libraries UK (November 2019) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Talk to cultural heritage audience at TNA about the project.
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge presented on the project Living with Machines to Alberta Comer, Dean and University Librarian, University of Utah (May 2019) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge presented on the project at the Library of Congress's Digital Strategy Roundtable, Washington DC (June 2019) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, 'Machine Learning and Digital Humanities' panel, University of Newcastle, Newcastle, September 5, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact As machine learning becomes more common across a wide range of digital solutions, and increasingly factors in our daily lives, it is also being used more frequently in humanities research projects. The possibilities of machine learning need to be understood by humanities researchers and the complexities of the problems investigated in the humanities by those working with machine learning technologies. The humanities can offer a wealth of historical data that presents new challenges to machine learning methodologies: historical records, pictorial representations, literary (or other) text. Recent Digital Humanities projects already employ some machine learning technology, such as with the development of Handwritten Text Recognition (HTR), but the diversification of the data investigated with machine learning approaches has the potential to lead the technology in new and unexpected ways with real-world applications. Panel members included Beatrice Alex (University of Edinburgh), Noura Al-Moubayed (Durham University), Mia Ridge (British Library), and Melissa Terras (University of Edinburgh).
Year(s) Of Engagement Activity 2019
URL https://n8cir.org.uk/events/machine-learning-and-digital-humanities/
 
Description Mia Ridge, invited presentation, British Library Data Projects workshop, London, August 19, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, Consortium of European Research Libraries (CERL) Annual Seminar 2019, Göttingen, October 9, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Policymakers/politicians
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, KCL / British Library Research Collaboration workshop, King's College London, September 27, 2019
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, Museums + AI Network workshop, Pratt Institute, New York, September 16, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
URL https://www.openobjects.org.uk/2019/09/museums-ai-new-york-workshop-notes/
 
Description Mia Ridge, invited presentation, Princeton University Library, Princeton, September 13, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, invited presentation, Research Libraries UK International Symposium on Digital Scholarship, London, October 14, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This symposium explored the nature and extent of digital scholarship occurring within research libraries across the international research library community. It brought together representatives from international research library associations, funders, the academic community, and global-library collectives to discuss areas of potential cross-sector and interdisciplinary collaboration, and the routes and networks through which this might be achieved. Mia Ridge presented on "Building capacity for digital scholarship at a research library: Living with Machines, and the impact of data science"
Year(s) Of Engagement Activity 2019
URL https://www.rluk.ac.uk/digital-scholarship-and-the-role-of-the-research-library-symposium-slides/
 
Description Mia Ridge, invited presentation, Wellcome Library, London, July 4, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Overview presentation on Living with Machines project
Year(s) Of Engagement Activity 2019
 
Description Mia Ridge, presentation, Library of Congress Machine Learning Summit, Washington DC, September 20, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Ridge touched on three main kinds of challenges: scale, operational and interdisciplinary, and copyright. A larger scale requires new workflows and quickly grows expensive, operationalizing raises the question of producing public-facing infrastructure, and copyright involves negotiating complex rights issues.
Year(s) Of Engagement Activity 2019
URL https://labs.loc.gov/static/labs/meta/ML-Event-Summary-Final-2020-02-13.pdf?loclr=blogsig
 
Description Olivia Vane, Katherine McDonough, Daniel van Strien, 21st Century Curator Talk (British Library staff talks), Maps and Machines: Using Computer Vision to Analyse the Geography of Industrialization (1780-1920), January 13, 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Talk on 'Using Computer Vision to Analyse the Geography of Industrialization (1780-1920)'
Year(s) Of Engagement Activity 2019
 
Description Ruth Ahnert and David Beavan contributed to the review process of the British Academy Digital Research in Humanities scheme
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact A workshop at the British Academy to learn about the progress of seven grants awarded through the Digital Research in Humanities scheme last year. The programme aims to extend support for researchers engaging with Digital Research in the Humanities by offering grants to carry out novel research through the application of new methods and tools to existing digital resources. The use and re-use of existing resources such as digital collections and datasets will demonstrate their capacity to generate new knowledge. The seven award-holders were invited to provide a brief presentation on the aims of their research, what is new about their approach to the topic, problems and challenges arising so far, scholarly benefits identified, the potential for scalability, and plans for the future. Ahnert and Beavan were asked to contribute to this review process, which will discuss what commonalities, learning and good practice are emerging from the projects to date. The workshop was also attended by representatives from Jisc, who partnered with the British Academy on the scheme, and Fellows of the British Academy.
Year(s) Of Engagement Activity 2019
 
Description Ruth Ahnert presented on the project to the Newseye project, at their project meeting hosted at the British Library (15 November 2018) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Third sector organisations
Results and Impact Report on Living with Machines to the Newseye project (which works on news, and so has important connections with our work)
Year(s) Of Engagement Activity 2018
 
Description Ruth Ahnert, 'Partial Histories from Partial Archives: Surveying the Land', Think-Play-Hack Summit 2019: Mythologies and World Views, SMU at Taos Campus, New Mexico, 3 July 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Think-Play-Hacks bring together researchers from across the United States and the world to contemplate the frontiers of interdisciplinary data science. Imagine a group of humanists, social scientists, data scientists, engineers, literary scholars, and students in the mountains of Taos, New Mexico, moving through big ideas step-by-step, over a series of conceptual explorations, micro-tutorials, and mini-hackathons.

"Think-Play-Hack" Summits incubate interdisciplinary exchange through one day of presentations where we "think" together on critical perspectives from the humanities and social sciences. Next, participants "play" with the fit between critical questions, data, and algorithms through a series of lightning presentations and micro-seminars. Finally, participants "hack" on the concepts and questions they have heard, generating collaborative white-papers or code-driven approaches to the question. Student teams compete for titles such as "best use of data" or "best visualization."

Think-Play-Hack was founded in 2018 by Mark Algee-Hewitt (Stanford University), Simon DeDeo (Carnegie Mellon University), James Evans (University of Chicago), Jo Guldi (Southern Methodist University), and Andrew Piper (McGill University).
Year(s) Of Engagement Activity 2019
 
Description Ruth Ahnert, Roundtable contribution on "Critical Data Science: Mapping the Field" at workshop: "Data Science and Digital Cultural Heritage: facilitating new connections between the disciplines and professions that can transform the Global Data Context", 27th June 2019 at UCL.
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact This workshop was organized by Dr Julianne Nyhan and Dr Tessa Hauswedell and was made possible by the UCL Grand Challenges fund and the UCL Centre for Critical Heritage. In this workshop, we sought to investigate how large-scale digital cultural heritage archives are offering new ground to explore the potentials and hazards of data science. We reflected on what might be gained from the building of a 'critical data science' that can better account for the social and cultural contexts that shape digital cultural heritage production and its data-led research. In doing so, we opened a multidisciplinary and trans-professional dialogue on the future of data science and digital cultural heritage.
Year(s) Of Engagement Activity 2019
 
Description Ruth Ahnert, paper presented: 'Living with Machines', Institute for Applied Data Science Colloquium, QMUL, 20 March 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Project overview talk.
Year(s) Of Engagement Activity 2019
 
Description Semantic Corpus Exploration 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A half-day workshop on Semantic Corpus Exploration organized at the Digital Humanities Conference, Utrecht (2019). Participants learned how to (critically) use DBPedia Spotlight and WideNet for exploring and traversing historical corpora. The workshop was organized by Alex Olieman (University of Amsterdam), Jaap Kamps (University of Amsterdam) and Kaspar Beelen.
Year(s) Of Engagement Activity 2019
URL https://aolieman.github.io/semantic-corpus-exploration/
 
Description The impact of OCR on downstream Natural Language Processing tasks (tl;dr) 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Blog post summarising a research paper.
Year(s) Of Engagement Activity 2020
URL http://livingwithmachines.ac.uk/the-impact-of-ocr-on-downstream-natural-language-processing-tasks-tl...
 
Description Turing Data Science and Digital Humanities special interest group: uniting all 13 Turing universities plus the GLAM sector. Co-I Barbara McGillivray leading on Manifesto for community building, organised by Co-I David Beavan and Co-I Barbara McGillivray
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Year(s) Of Engagement Activity 2019