Living with Machines
Lead Research Organisation:
The Alan Turing Institute
Department Name: Research
Abstract
Living with Machines is both a research project and a bold proposal for a new research paradigm. In this ground-breaking partnership between The Alan Turing Institute, the British Library, and the Universities of Cambridge, East Anglia, Exeter, and London (QMUL), historians, data scientists, geographers, computational linguists, and curators have been brought together to examine the human impact of the industrial revolution.
It is widely recognised that Britain was the birthplace of the world's first industrial revolution, yet there is still much to learn about the human, social, and cultural consequences of this historical moment. Focussing on the long nineteenth century (c.1780-1920), the Living with Machines project aims to harness the combined power of massive digitised archives and computational analytical tools to examine the ways in which technology altered the very fabric of human existence on a hitherto unprecedented scale. The central theme - the mechanisation of work practices - speaks directly to present debates about how society can accommodate the revolutionary consequences of AI and robotics in what has become known as the fourth industrial revolution. To understand the fraught co-existence of human and machine, this project contends that we need research methods that combine technological innovation and human expertise.
The project will utilise the British Library's National Newspaper collection, and event-based records (census, electoral registration, births/marriages/deaths, trade directories) collected by contributing partner Findmypast. By developing intuitive computational interfaces, and adapting collaborative practices developed in the field of software development, we will enable close interaction between computational methods and historical inquiry.
Outreach and Engagement will be central to the project from the outset, and will take two forms: familiar outcomes such as television programmes and regional exhibitions; and working with individuals and communities to create common understandings of their shared histories. Participatory aspects will embody best practices in crowdsourcing and citizen history.
Project benefits:
1. The UK's first large-scale synergy between data science, artificial intelligence research, and the arts and humanities, building capacity and catalysing new research areas.
2. The development of new computational techniques to marshal the UK's rich archival collections (digitised and born-digital), to enable new research questions to be posed of the holdings.
3. Enriched and interlinked data holdings for the British Library, to add additional context and value to content.
4. The development of generalisable tools, code, and infrastructure that can be adapted for and inspire future interdisciplinary research projects.
5. New historical perspectives on the effects of the mechanisation of labour on the lives of ordinary people during the long nineteenth century.
6. The creation of computational models to represent how language and meanings change across time and geography.
7. Research breakthroughs maintaining UK global leadership in Digital Humanities and driving large-scale international partnerships and opportunities.
Publications
Ahnert R
(2021)
Living with Machines Delivery Plan version 1, 2019
Ahnert R
(2023)
Living with Machines Final Report
Andrew Darby
(2022)
AI training resources for GLAM: a snapshot
Angelina Mcmillan-Major
(2022)
Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources
Ardanuy M.C.
(2020)
Living Machines: A study of atypical animacy
in COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference
Barbara McGillivray
Debates in Digital Humanities: Computational Humanities
Title | British Library Newspapers and Living with Machines |
Description | Talk on the British Library's Newspaper collection and the Living with Machines project for Congrès Médias 19 - Numapresse : Presses anciennes et modernes à l'ère du numérique, La BnF, 3 June 2022 |
Type Of Art | Film/Video/Animation |
Year Produced | 2022 |
URL | https://figshare.com/articles/presentation/British_Library_Newspapers_and_Living_with_Machines/19963... |
Title | Echoing Through Time: New Tunes for Old Words |
Description | Leeds-based folk musicians were commissioned to record historical ballads from the British Library's collections that had been researched during the exhibition development process. They recorded the tracks, and subsequently posted them to the British Library's Soundcloud account. |
Type Of Art | Composition/Score |
Year Produced | 2022 |
Impact | The musicians are going on to release the recordings on CD, and have performed them live at events for the exhibition. |
URL | https://soundcloud.com/the-british-library/sets/echoing-through-time-new-tunes-for-old-words |
Title | Historic machines from 'prams' to 'Parliament': new avenues for collaborative linguistic research |
Description | Recording of presentation of long paper, DH Benelux 2022: RE-MIX. Creation and alteration in DH (Hybrid), 1-3 June 2022. Research in computational linguistics has made successful attempts at modelling word meaning at scale, but much remains to be done to put these computational models to the test of historical scholarship. More importantly, a lot of computational research looks at texts in a historical vacuum, 'synchronically', as linguists would say. Living with Machines is an interdisciplinary research project that rethinks the impact of technology on the lives of ordinary people during the Industrial Revolution. During this project, we decided to address a fundamental question: what did people mean by 'machine' and how has this meaning changed over time? This paper outlines how a simple research question like 'what was a machine?' can provide an opportunity to engage the public with our work while also generating data for analysis and new avenues of research in a radically collaborative way. |
Type Of Art | Film/Video/Animation |
Year Produced | 2022 |
URL | https://zenodo.org/record/6583744 |
Title | Leeds musicians performing historical ballads from the British Library's collections for the Living with Machines exhibition |
Description | Leeds-based folk musicians were commissioned to record historical ballads from the British Library's collections that had been researched during the exhibition development process. They also performed the songs at the exhibition opening, and will perform them at future events related to the exhibition. |
Type Of Art | Performance (Music, Dance, Drama, etc) |
Year Produced | 2022 |
Impact | The musicians are packaging the recordings as a CD for sale at gigs etc. |
Title | Living with Machines: human stories from the industrial age |
Description | Living with Machines is the first large-scale exhibition developed in partnership between the British Library and Leeds Museums & Galleries. The exhibition is inspired by the Living with Machines research project. The free exhibition revisits the history of the industrial revolution in Britain through the lens of Leeds and the surrounding regions. It unearths forgotten stories revealing how rapid changes in technology in the nineteenth century changed life and work forever. Contemporary responses, offering reflections on the parallels between mechanisation in the 19th century and advances in AI and digital technology are woven throughout the display. The accompanying events programme includes loom weaving, crafts workshops, a Wiki edit-a-thon, and a special AI series as part of Leeds Digital Festival. The exhibition includes innovative digital interactives built with data crowdsourced through the project. Exhibition captions were written to be accessible to those without any knowledge of the topic, and to a reading age of approximately 10 years old. |
Type Of Art | Artistic/Creative Exhibition |
Year Produced | 2022 |
Impact | Exhibition research led to the recording of 19th century ballads by contemporary musicians. The events programme has drawn in a range of audiences, from families with very young children to workers in the tech industry. By the end of September, attendance figures for events were 1,416 family programme visitors of all ages and 141 adult event attendees; the attendance figure for the exhibition was 18,258. |
URL | https://museumsandgalleries.leeds.gov.uk/events/leeds-city-museum/living-with-machines-human-stories... |
Title | Newspaper Infographic Exhibition, British Library |
Description | LwM took responsibility for one of the panels in the British Library's (forthcoming) exhibition of nineteenth-century newspaper infographics. In collaboration with the Library's Lead Curator of News, Luke McKernan, and Yann Ryan; Daniel Wilson (History, text) and Mariona Coll Ardanuy (Computational Linguistics, code) conceived of an experimental panel to showcase our research using sentiment analysis on historical newspapers. The panel was made by infographic designer Ciaran Hughes, using datasets provided by the project which focused on emotional responses to industrialisation as seen in newspaper headlines. The exhibition will involve six such panels and will use modern infographic presentational techniques on historical data to tell arresting new stories about nineteenth-century Britain. The exhibition will open on the Lower Ground Floor of the BL in Spring 2021 and will hopefully be seen by large numbers of people and be reported in the press itself. |
Type Of Art | Artistic/Creative Exhibition |
Year Produced | 2020 |
Impact | We hope the exhibition will be a corrective to the badly researched uses of historical newspapers to make methodologically unsound claims about the past, and instead showcase a more credible way to apply data science to historical materials, while simultaneously grabbing attention and showcasing the work of the project. |
Description | The scope and ambition of this project can be summed up in terms of its two foundational objectives. Living with Machines (LwM) sought to understand what computationally-driven historical research is now possible in light of more than two decades of investment in the digitisation of our national cultural assets. And it sought to realise that research potential through a radical experiment in collaboration, bringing together an extremely large and diverse team of researchers and professionals. On both counts, LwM was a huge success. Firstly, LwM has amassed a vital evidence-base for helping us to understand the affordances of digitised holdings, as well as the barriers for research, within the current cultural heritage data landscape in the UK. The project has provided models for working within that landscape as well as making recommendations for changes to policy in our book Collaborative Historical Research in the Age of Big Data: Lessons from an interdisciplinary project (Cambridge University Press, 2023) (CHR). LwM has also created a whole host of new assets which help to make digitised collections research-ready. These assets comprise new datasets (including new digitised content, derived data from existing collections, and databases) and code (including Jupyter notebooks, code accompanying publications, pipelines, and documented software). At the applied level, these combined assets have unlocked a huge number of research opportunities and new historical insights. At the more general level, they represent potential building blocks towards a modularised research infrastructure. 
In this space we have also seen a number of stand-out successes, including the development of 'MapReader', a computer vision pipeline that has been harnessed successfully not only for the distant viewing and searching of map sheets at scale, but also for the analysis of biological image data - demonstrating how the humanities can develop technical products that have an impact in the sciences. More importantly, we have begun to develop frameworks for building sustainable communities around these assets through training and workshops, which will influence and be extended by one of LwM's follow-on projects. Secondly, as an experiment in radical interdisciplinary research, the project has also delivered beyond its expectations. This is not to say that everything ran without friction, but rather that we learned huge amounts about one another's fields, about different ways of working and how to work together, and we took time regularly to reflect and decide how to continue in light of what we had learned. As a result we were able to develop our practice and research iteratively, in ways which played to new realities and new strengths. For many members of the team, this experience has been truly transformative: all members will be taking these new skills and experiences forward with them, and we were also able to make concrete recommendations back to our communities in light of this process through our CHR book, our 10-part docu-series, talks, and blog posts. As a result of the progress made in these two areas, we are now at the point where we can begin to write new histories of the impact of mechanisation on the lives of ordinary people in the long nineteenth century. We are in the process of publishing these through a series of articles and a multi-authored book (under contract), Living with Machines: Computational Histories of the Age of Industry. |
Exploitation Route | We have created new tools and datasets for a number of different communities which we believe have the power to drive forward research by leaps, rather than increments. Crucially we have sought to ensure that this progress can be carried out beyond our team by making all our code openly available, and by publishing data for reproducibility. However, where many endeavours in this space fall short is with the attitude 'if you build it, they will come'. New methods, tools and datasets are of no use if nobody knows about them. In our final phase of the project we have been focusing on ensuring the legacy of our collaboration by developing user communities around our most important tools and methods through blog posts, workshops, in-person and published tutorials. The specific work of building communities around these assets will be driven forward by the spin-off project 'Building sustainable communities around datasets and software', which began shortly after the end of Living with Machines (LwM) and is led by Pieter Francois (Turing/Oxford), with several members of LwM as Co-Is (Ahnert, Beavan, Nanni, Hobson, and McDonough). The proposal for this project pinpointed the problems of project-based funding for digital research, which often equates to poor return on investment due to the lack of infrastructure to support outputs and their uptake beyond a project's end date in terms of hosting, maintenance, or human expertise. The team's proposal extended the blueprint of community development already tested on LwM. It proposed that if components were made more generalisable by being well packaged and documented, and if communities of users and maintainers were actively built around them, the UK could create the basic components of a modular research infrastructure. 
This project joins the suite of spin-off projects co-developed by team members, including Machines Reading Maps (McDonough), The Congruence Engine (Wilson), Impresso 2 (Beelen), and The Collective Wisdom project (Ridge). Finally, we sought to seed further innovative engagement with our project's assets by funding six 'digital residencies'. These residencies were small fellowships or project awards designed to enable work around one of our datasets or tools. These include data visualisations, a visual poem, a performance piece, a newspaper data processing pipeline, and an online book of tutorials about how to work with newspaper data. The fruits of their labour are reported on our project blog (https://livingwithmachines.ac.uk/latest/) and in reports deposited on the British Library's Research Repository (https://bl.iro.bl.uk/). |
Sectors | Creative Economy; Digital/Communication/Information Technologies (including Software); Culture, Heritage, Museums and Collections |
URL | https://livingwithmachines.ac.uk/ |
Description | Thanks to the leadership provided by the British Library (BL), the public engagement work on the project has been first class. The project exhibition was attended by over 42,000 people, and the crowdsourcing work put us in contact with over 5,500 volunteers. As our overview of stakeholder feedback shows, the project has had an impact both on the future of humanities work at the Turing, and on our partners in the cultural heritage sector. The BL reports benefits from their collaboration with Leeds City Museum on the exhibition, from the programme of crowdsourcing, through the enrichment of the existing Digital Scholarship Training Programme, as well as in the development of spinoff projects (such as the development of Flyswot). More broadly, the experience of LwM will shape the BL's future AI strategy in important ways. In addition we have been excited to see how effectively we have been able to stimulate the use of our data and code assets through the mechanism of the Digital Residencies. These have not only delivered innovative outcomes, but extended the community gathering around our work into areas such as art and performance. We have also been able to reach a much larger number of people through training and workshops due to the increased effort in that area compared to our plans at the outset. We are also very pleased that our research has reached the public through the development of our work on OS maps into a story run by The Economist in April 2023. During the months following the official end of the project, we are also releasing episodes from our documentary series via The Alan Turing Institute's YouTube channel (https://www.youtube.com/playlist?list=PLuD_SqLtxSdWMYcu5YQDGqP9AGejg_cBb) which are designed to make the different research outcomes from our project accessible to the general public. |
First Year Of Impact | 2023 |
Sector | Creative Economy; Digital/Communication/Information Technologies (including Software); Education; Leisure Activities, including Sports, Recreation and Tourism; Culture, Heritage, Museums and Collections |
Impact Types | Cultural |
Description | Advisory Board member (and author of white paper) 'iDAH Research Software Engineering (RSE) Steering Group Working Paper' |
Geographic Reach | National |
Policy Influence Type | Participation in a guidance/advisory committee |
Description | Andre Piza participated in European Commission's "Study on Opportunities and challenges of Artificial Intelligence Technologies for the Cultural and Creative Sectors" |
Geographic Reach | Europe |
Policy Influence Type | Contribution to a national consultation/review |
Description | British Library Research Report 2018-19 |
Geographic Reach | National |
Policy Influence Type | Citation in other policy documents |
Impact | The British Library Research Report features the Living with Machines project accounting for its impact on the British Library's activities. According to the report, "The project has already helped the Library explore the potential and challenges of data science methods, including copyright, the use of cloud-based services at scale, and meshing digitisation and analytical timeframes." *(1) and it is "advancing our [the BL's] capability to undertake computational analysis using very large and heterogeneous digitised sources, and our understanding of types of infrastructure that will enable us to deploy more data-driven research in the future."*(2) *(1) Mia Ridge, British Library's Digital Curator for Western Heritage Collections (and Co-I on Living with Machines) *(2) Maja Maricevic, British Library's Head of Higher Education and Science (and Co-I on Living with Machines) |
URL | https://www.bl.uk/news/2020/november/publication-of-2018-19-research-report |
Description | Guest lecture and assigned reading: Crowdsourcing at the British Library for UCL's MSc in Data Science for Cultural Heritage |
Geographic Reach | National |
Policy Influence Type | Influenced training of practitioners or researchers |
Description | Guest lecture: Europeana masterclass for Open Digital Cultural Heritage |
Geographic Reach | Europe |
Policy Influence Type | Influenced training of practitioners or researchers |
Description | Guest lecture: INOS project, overview of citizen science and crowdsourcing |
Geographic Reach | Europe |
Policy Influence Type | Influenced training of practitioners or researchers |
URL | https://inos-project.eu/2021/07/28/workshop-report-citizen-science-why-get-involved/ |
Description | Hands-on workshop 'Planning Crowdsourcing Projects in Cultural Heritage' for Europeana Research and the Europeana Research Community |
Geographic Reach | Multiple continents/international |
Policy Influence Type | Influenced training of practitioners or researchers |
Impact | Participants reported improved abilities to undertake crowdsourcing projects. As the workshop was held a few weeks ago, evidence is still being gathered. |
Description | Invited lecture, Crowdsourcing at the British Library |
Geographic Reach | National |
Policy Influence Type | Influenced training of practitioners or researchers |
Description | Participation as a case study in AHRC's Technician Commitment Action Plan |
Geographic Reach | National |
Policy Influence Type | Contribution to a national consultation/review |
Impact | RTP case studies will be used for internal and external purposes. In the first instance, they will be shared with colleagues across the organisation to increase everyone's understanding of the term Research Technical Professional (technician) within the context of the Arts and Humanities. This will enable AHRC colleagues to confidently identify members of the RTP community working within their respective schemes and keep members of this community informed of how we are championing the Technician Commitment. Case Studies will help to ensure that decisions made at a strategic and governance level are well informed by the varied experiences of RTPs. In the future we will be keen to publish case studies on external platforms such as our website and communication channels, for example the AHRC newsletter and blog. Most importantly, Case Studies will inform research, dialogue and future activity with regards to AHRC's Technician Commitment Action Plan. |
Description | Project output used in collaborative workshop with Estonian museums |
Geographic Reach | Europe |
Policy Influence Type | Influenced training of practitioners or researchers |
Impact | Attendees developed skills at the workshop, enhanced by their access to our Open Access Handbook. |
URL | https://esm.ee/for-visitors/news/the-war-museum-helps-estonian-museums-to-put-crowdsourcing-into-use |
Description | Ruth Ahnert fed into the Forecasting Forum on the Future of Research, at thinktank Demos, December 5th 2019, London. |
Geographic Reach | National |
Policy Influence Type | Participation in a guidance/advisory committee |
URL | https://demos.co.uk/wp-content/uploads/2019/10/Jisc-OCT-2019-2.pdf |
Description | Special lecture 'Things to Know When Planning Crowdsourcing Projects in Cultural Heritage' for Europeana |
Geographic Reach | Multiple continents/international |
Policy Influence Type | Influenced training of practitioners or researchers |
Description | Task & Finish Group for JISC |
Geographic Reach | National |
Policy Influence Type | Contribution to new or improved professional practice |
URL | https://digitisation.jiscinvolve.org/wp/2023/02/03/is-ai-for-me |
Description | Congruence Engine -- Towards a National Collection Discovery Project -- Secondment for Daniel Wilson to Science Museum Group |
Amount | £3,000,000 (GBP) |
Organisation | Arts & Humanities Research Council (AHRC) |
Sector | Public |
Country | United Kingdom |
Start | 11/2021 |
End | 07/2024 |
Description | From crowdsourcing to digitally-enabled participation: the state of the art in collaboration, access, and inclusion for cultural heritage institutions |
Amount | £64,801 (GBP) |
Funding ID | AH/T013052/1 |
Organisation | Arts & Humanities Research Council (AHRC) |
Sector | Public |
Country | United Kingdom |
Start | 02/2020 |
End | 12/2021 |
Description | Machines Reading Maps: Finding and Understanding Text on Maps |
Amount | £199,529 (GBP) |
Funding ID | AH/V009400/1 |
Organisation | Arts & Humanities Research Council (AHRC) |
Sector | Public |
Country | United Kingdom |
Start | 02/2021 |
End | 10/2022 |
Title | A Toponym Resolution Pipeline for Digitised Historical Newspapers |
Description | T-Res is an end-to-end pipeline for toponym detection, linking, and resolution on digitised historical newspapers. Given an input text, T-Res identifies the places that are mentioned in it, links them to their corresponding Wikidata IDs, and provides their geographic coordinates. T-Res has been developed to help researchers explore large collections of digitised historical newspapers, and has been designed to tackle common problems often found when dealing with this type of data. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2023 |
Provided To Others? | Yes |
Impact | The tool has already been used for enriching other datasets in the project, and is currently being used in different research experiments for finding places in historical newspapers. |
URL | https://github.com/Living-with-machines/T-Res |
Title | Beavan, D., Jackson, M. Plain text and metadata extraction tool |
Description | Tool for parallel processing of XML in METS/ALTO format for extraction of plain text and metadata fields, available in XSLT and Python versions. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | This data wrangling tool facilitated downstream analysis of historical newspapers focussing on toponym resolution and OCR quality. It forms an essential part of the preprocessing pipeline that will be applied to new datasets whose acquisition is in progress. |
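The extraction step described in this entry can be illustrated with a minimal sketch. It is not the project's tool: it handles a single ALTO document rather than processing files in parallel, ignores METS metadata and hyphenation markers, and the function names `local_name` and `alto_to_text` and the `SAMPLE` document are invented for illustration. It assumes only the standard ALTO layout in which each `TextLine` element contains `String` elements whose `CONTENT` attribute holds a word.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_to_text(alto_xml: str) -> str:
    """Return plain text with one output line per ALTO TextLine element."""
    root = ET.fromstring(alto_xml)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        # Each word is stored in the CONTENT attribute of a String child.
        words = [
            child.attrib["CONTENT"]
            for child in element
            if local_name(child.tag) == "String" and "CONTENT" in child.attrib
        ]
        if words:
            lines.append(" ".join(words))
    return "\n".join(lines)


# Hypothetical two-line ALTO fragment, invented for this example.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine><String CONTENT="Living"/><SP/><String CONTENT="with"/></TextLine>
    <TextLine><String CONTENT="Machines"/></TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

if __name__ == "__main__":
    print(alto_to_text(SAMPLE))
```

Matching on the local tag name rather than a fixed namespace is a design choice for the sketch, since ALTO namespaces differ between schema versions.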
Title | Beelen, K., Lexicon Expansion Interface |
Description | Notebook for exploring word2vec models in order to build a lexicon that can trace certain topics in a collection. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | The Lexicon Expansion Interface allows users to navigate a vector space and expand a list of seed words into a Lexicon. |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/tree/lexicon-expansion/language-l... |
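The idea of expanding seed words into a lexicon by navigating a vector space can be sketched as follows. This is a hedged illustration, not the notebook's code: the three-dimensional vectors in `VECTORS` are invented toy values standing in for a trained word2vec model, and the function `expand_lexicon` and its `threshold` parameter are hypothetical names. The sketch adds any word whose cosine similarity to at least one seed word meets the threshold.

```python
import math

# Toy 3-dimensional "embeddings", invented for illustration only.
# A real workflow would load vectors from a trained word2vec model.
VECTORS = {
    "machine": [0.9, 0.1, 0.0],
    "engine": [0.8, 0.2, 0.1],
    "loom": [0.7, 0.3, 0.0],
    "harvest": [0.1, 0.9, 0.2],
    "sermon": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def expand_lexicon(seeds, vectors, threshold=0.9):
    """Add every word whose similarity to any seed meets the threshold."""
    lexicon = set(seeds)
    for word, vec in vectors.items():
        if word in lexicon:
            continue
        if any(cosine(vec, vectors[seed]) >= threshold for seed in seeds):
            lexicon.add(word)
    return sorted(lexicon)


if __name__ == "__main__":
    print(expand_lexicon(["machine"], VECTORS))
```

In practice a researcher would review each candidate before accepting it into the lexicon; the fixed threshold here stands in for that manual judgement.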
Title | Beelen, K., Lexicon Generator, a tool for generating contrastive lexicons using newspaper data |
Description | Notebook for building a lexicon by contrasting two corpora using the Fightin' Words algorithm created by Monroe et al, 2008. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | This notebook is an implementation of the Monroe et al algorithm "Fightin' Words". It is a feature extraction algorithm that computes which words are most significantly associated with a specific subcorpus. This notebook helps us to "profile" certain types of language (e.g. contrast conservative to liberal newspapers) |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language-lab-mro/lexi... |
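As a rough illustration of the contrastive scoring this entry describes, the sketch below computes z-scored log-odds ratios with a Dirichlet prior, the core statistic of Monroe et al. (2008). It is not the project notebook: it uses a uniform prior as a simplification (the paper also describes informative priors drawn from a background corpus), the function name `fightin_words` and the `prior` parameter are hypothetical, and the two token lists are invented. It assumes a combined vocabulary of at least two word types.

```python
import math
from collections import Counter


def fightin_words(tokens_a, tokens_b, prior=0.01):
    """Z-scored log-odds ratios with a uniform Dirichlet prior.

    Positive scores mark words associated with corpus A, negative with B.
    """
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    vocab = set(counts_a) | set(counts_b)
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    alpha_0 = prior * len(vocab)  # total prior mass over the vocabulary

    scores = {}
    for word in vocab:
        y_a = counts_a[word] + prior  # smoothed count in corpus A
        y_b = counts_b[word] + prior  # smoothed count in corpus B
        log_odds_a = math.log(y_a / (n_a + alpha_0 - y_a))
        log_odds_b = math.log(y_b / (n_b + alpha_0 - y_b))
        variance = 1.0 / y_a + 1.0 / y_b  # approximate variance of the difference
        scores[word] = (log_odds_a - log_odds_b) / math.sqrt(variance)
    return scores


# Invented toy corpora standing in for two contrasting newspaper subcorpora.
TOKENS_A = "free trade reform reform liberty".split()
TOKENS_B = "church crown tradition tradition trade".split()

if __name__ == "__main__":
    ranked = sorted(fightin_words(TOKENS_A, TOKENS_B).items(), key=lambda kv: kv[1])
    for word, score in ranked:
        print(f"{word:10s} {score:+.3f}")
```

Dividing by the estimated standard deviation is what keeps rare words, whose raw log-odds are noisy, from dominating the ranking.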
Title | Beelen, K., Newspaper metadata database and search interface: scripts to build an ElasticSearch index and explore the data using Kibana |
Description | Scripts to build an ElasticSearch index and explore the data using Kibana |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | Newspaper metadata database and search interface. |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/sources-lab-mro/elast... |
Title | Beelen, K., Pipeline for processing the Newspaper Press Directories |
Description | The series of notebooks includes a pipeline for processing the OCR (derived from the scans of Mitchell's Press Directories). The stages include: annotation, preprocessing, automatic tagging and database ingest. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | This tool will be crucial for parsing and enriching implicitly structured data (such as the press directories, but also other historical sources). |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/sources-lab-mro/ndp_p... |
Title | Code for Targeted Sense Disambiguation |
Description | Code for Targeted Sense Disambiguation and for reproducing the results published in the paper at http://dx.doi.org/10.18653/v1/2021.findings-acl.243 |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2021 |
Provided To Others? | Yes |
Impact | Reproducing results of the paper. Tools for historical sense disambiguation. |
URL | https://github.com/Living-with-machines/TargetedSenseDisambiguation |
Title | Coll Ardanuy, M., Hosseini, K., van Strien, D., McDonough, K., Wilson, D., Krause, A., underlying code for the paper 'Resolving Places, Past and Present: Toponym Resolution in Historical British Newspapers Using Multiple Resources' |
Description | Underlying code for the paper 'Resolving Places, Past and Present: Toponym Resolution in Historical British Newspapers Using Multiple Resources'. Resolving Places is one of the first outputs of Living with Machines, a collaborative digital history project at The Alan Turing Institute and the British Library. This research is part of our work to build a nineteenth-century gazetteer that combines place names derived from historical sources (GB1900) with online resources (Wikipedia and Geonames). GB1900 is the result of a crowdsourced project that transcribed all text labels on the 2nd edition 6-inch to 1 mile Ordnance Survey maps of Great Britain (ca. 1900) held by the National Library of Scotland (NLS Maps online). The Living with Machines gazetteer follows best practices in combining multiple existing resources, and is novel in accounting for places that have different scales (e.g. streets, buildings, cities, counties). In the future, we will be adding records and enriching current records with information from OS map 1st edition map label data and other sources. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | This work was presented at a workshop on 27-28 November. Several workshop attendees showed interest in using the gazetteer produced through this code. Subsequent completed work and work in progress uses it, within and outside our project. |
URL | https://github.com/alan-turing-institute/lwm_GIR19_resolving_places/ |
Title | Coll-Ardanuy, M., Code that builds a gazetteer from scratch |
Description | Code and method to generate a gazetteer from Wikipedia and enriched with Geonames data. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | Part of larger workflow to create a geographical knowledge base that combines different 19thC knowledge sources together. |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language-lab-mro/gaze... |
Title | Coll-Ardanuy, M., Hosseini, K., Nanni, F., Toponym Matching |
Description | This work looks for potential locations for each toponym identified in text. It addresses the high degree of variation in toponyms (due to regional spelling differences, transliteration strategies, and cross-language and diachronic variation) as well as variation due to OCR errors. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2021 |
Provided To Others? | Yes |
Impact | We have built a flexible deep learning framework for candidate selection through toponym matching, using various state-of-the-art neural network architectures (DeezyMatch). The paper that accompanies this repository assesses the performance of DeezyMatch in different experimental settings. The DeezyMatch repository has had a notable impact; this accompanying repository is used for reference. |
URL | https://github.com/Living-with-machines/LwM_SIGSPATIAL2020_ToponymMatching |
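To make the candidate selection task concrete, here is a minimal baseline sketch. It does not use DeezyMatch or any neural model: it swaps in plain character-level similarity from Python's standard `difflib` module. The function name `rank_candidates`, the tiny gazetteer and the OCR-style query are hypothetical.

```python
from difflib import SequenceMatcher


def rank_candidates(query, gazetteer, top_n=3):
    """Rank gazetteer names by character-level similarity to the query."""
    scored = [
        (SequenceMatcher(None, query.lower(), name.lower()).ratio(), name)
        for name in gazetteer
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


# Invented example: "Manchefter" mimics an OCR misreading of a long s.
gazetteer = ["Manchester", "Winchester", "Colchester", "Leeds"]
candidates = rank_candidates("Manchefter", gazetteer)
```

A learned matcher is useful precisely where such a surface-similarity baseline falls short, for instance with transliterations or historical name variants that share few characters.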
Title | DeezyMatch Tutorials |
Description | The "DeezyMatch_tutorials" Github repository is a collection of tutorials for DeezyMatch (a free, open-source software library written in Python for fuzzy string matching and candidate ranking, developed within the Living with Machines project). In this repository, we collect some tutorials for DeezyMatch, and provide new code for a tutorial offered at the Digital Humanities 2022 conference. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2022 |
Provided To Others? | Yes |
Impact | This repository collects a series of tutorials for DeezyMatch, to help other researchers use the tool. The main DeezyMatch repository has 105 stars and 29 forks (as of June 2023). |
URL | https://github.com/Living-with-machines/DeezyMatch_tutorials |
Title | Hobson, T., Tolfo, G. Methodological paper on Living with Machines' metamodel |
Description | Data modelling methodology developed to underpin data infrastructure with the aim of promoting interoperability of tools and systems and accessibility of data and derived artefacts within the project and externally. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | The common data model developed by this method has been used in the design of relational database schemas and other research infrastructure to support interoperability across different source data types and varied research activities. |
URL | https://www.overleaf.com/read/qjqqfdrqxkpr |
Title | Hosseini, K. and Vane, O. PressPicker code |
Description | The PressPicker tool can be used to filter and visualise British Library holdings of undigitised newspapers as a function of time. It is also an interactive tool to pick newspaper titles (e.g. for digitisation). It consists of two Python Jupyter notebooks and a custom JavaScript interactive visualisation. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | Successfully made two selections of newspaper titles for digitising within Living with Machines. |
Title | Hosseini, K., Beelen, K., basic lexicon expansion algorithms using word embeddings |
Description | In this notebook, we use the trained word embeddings (using word2vec or fasttext models) to explore the semantic space of our book and sample newspaper datasets. Several basic methods are implemented, e.g. explore the neighbouring words given a seed word (e.g., what are the most similar words to "machine" given our corpus?); visualisation of word vectors using t-SNE. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | This work is in progress. |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/blob/master/language-lab-mro/lexi... |
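The neighbour lookup described in the entry above (most similar words to a seed word) can be sketched as follows. This is an illustration only: the three-dimensional vectors are made up, whereas real word2vec or fastText vectors would be loaded from a trained model, and the helper names `cosine` and `nearest_neighbours` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbours(seed, vectors, top_n=2):
    """Return the top_n words whose vectors are closest to the seed word."""
    seed_vec = vectors[seed]
    others = [
        (cosine(seed_vec, vec), word)
        for word, vec in vectors.items()
        if word != seed
    ]
    others.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in others[:top_n]]


# Toy vectors, invented for illustration (not taken from any trained model).
toy_vectors = {
    "machine": [0.9, 0.1, 0.0],
    "engine": [0.8, 0.2, 0.1],
    "loom": [0.7, 0.3, 0.0],
    "harvest": [0.0, 0.2, 0.9],
}
neighbours = nearest_neighbours("machine", toy_vectors)
```

Lexicon expansion then amounts to adding the returned neighbours to the seed list and, optionally, repeating the lookup.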
Title | Hosseini, K., Nanni, F., Coll-Ardanuy, M., DeezyMatch: A Flexible Deep Neural Network Approach to Fuzzy String Matching |
Description | A free, open-source software library written in Python for fuzzy string matching and candidate ranking. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | String matching is an integral component of many natural language processing (NLP) pipelines. DeezyMatch, a new deep learning approach to fuzzy string matching and candidate ranking, is a free, open-source community software that strives to address advanced string matching and candidate ranking challenges in a more comprehensive and integrated manner than existing tools. DeezyMatch is written in the Python programming language. Thanks to its easy-to-use interfaces, DeezyMatch can be seamlessly integrated into existing entity linking systems. This allows DeezyMatch to be adopted outside the NLP community, especially in Digital Humanities, where it could play a major role in addressing known issues concerning the adoption of entity linking systems due to the non-standard nature of the datasets typically used in this field. DeezyMatch has been the topic of a tutorial and round table (at the LinkedPasts conference 2020) and of an interactive workshop (at the Alan Turing Institute Digital Humanities and Research Software Engineering Summer School, 2021). The GitHub repository has 64 stars and 26 forks. |
URL | https://github.com/Living-with-machines/DeezyMatch |
Title | Hosseini, K., exploratory data analysis of GB1900 dataset |
Description | A set of Jupyter-notebooks for visualisation and statistical analysis of GB1900 dataset. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | These Jupyter-notebooks were developed to explore the GB1900 dataset, including visualisation of various entities (e.g., railway) on a map. |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/space-time-mro/gb1900... |
Title | Hosseini, K., exploratory data analysis of newspaper/book databases |
Description | A set of Jupyter-notebooks to perform exploratory data analysis on newspaper and book databases. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | These notebooks were developed as teaching/research tools to: 1) show how to access a remote Postgres DB, query it and plot the results; 2) perform exploratory data analysis (e.g., visualisation and simple statistical analysis) on the data. |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/relational_database_e... |
Title | Hosseini, K., from raw data to language-models/word-embeddings |
Description | These notebooks combined form a pipeline in which raw book/newspaper textual data can be accessed, preprocessed and then used to generate word embeddings and language models. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | These notebooks (and their Python-script version) have been extensively used to generate word2vec, fasttext, Flair and BERT language models. These models are being used in several NLP-related projects. |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language_models/noteb... |
Title | Hosseini, K., intrinsic evaluation of word embeddings / language models |
Description | The performance of any trained machine learning model needs to be evaluated (intrinsically or extrinsically) before being used. Here, we collected several datasets and developed a set of scripts to evaluate trained word embeddings and language models. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | Evaluation of all word-embeddings/language models being used in the project. |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/blob/master/language_models/noteb... |
Title | Hosseini, K., parallel processing of book (and newspaper) dataset using MPI (Message Passing Interface) |
Description | As we are dealing with large textual datasets (e.g., our book dataset contains 4.5B words), we started to experiment with different distributed and parallel algorithms to preprocess the data and to train machine learning models. Here, we used MPI (Message Passing Interface) through Python. This code distributes the job among the requested number of CPUs (workers), which can be on different nodes in a supercomputer (i.e. not limited to shared-memory machines); it therefore significantly reduces the wall time. This code was tested on Urika. Unfortunately, Urika is no longer available, and we now exclusively use Azure virtual machines (VMs). These VMs are shared-memory, so we switched to simpler parallel-processing algorithms. However, the MPI algorithm and tools developed here should be usable later when we have access to even larger datasets. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | Preprocess and extract information (e.g., part-of-speech tagging) from large textual datasets. |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/language_models/mpi_v... |
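The way a job can be distributed among workers, as described in the entry above, is sketched below. This is not the project's MPI code and does not call MPI at all: a plain loop over ranks simulates what each worker would do, under the assumption of a simple round-robin split. The helper names `partition` and `count_words` and the toy documents are hypothetical.

```python
def partition(items, rank, size):
    """Return the items that worker `rank` (0-based) out of `size` handles."""
    return items[rank::size]


def count_words(documents):
    """Stand-in for a real preprocessing step: count whitespace tokens."""
    return sum(len(doc.split()) for doc in documents)


# Toy documents, invented for illustration.
documents = [
    "steam engine works",
    "the loom",
    "railway station opened today",
    "mill",
]
n_workers = 3

# In an MPI program each rank would compute only its own partial result;
# here the ranks are simulated one after another.
partial_counts = [
    count_words(partition(documents, rank, n_workers))
    for rank in range(n_workers)
]
# Combining the partial results plays the role of a reduce step.
total_words = sum(partial_counts)
```

Because every document lands in exactly one partition, the combined total equals the result of a single-process run.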
Title | Hosseini, K., record linkage using various multi-class classifiers and manual annotations |
Description | Record linkage across two noisy datasets (for example, historical texts) is a non-trivial task. In this tool, we experimented with different multi-class classifiers, e.g. decision tree and multilayer perceptron architectures. We also assessed the impact of features (e.g., title, date and place of publication) on the statistical performance of these models. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | Creating a list of linked entities between NPD (newspaper press directory) and British Library titles list. |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/sources-lab-mro/linki... |
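The kind of pairwise features mentioned in the entry above (title, date and place of publication) can be sketched as follows. This is an assumption-laden illustration, not the project's feature set: the record fields `title`, `year` and `place`, the function name `linkage_features` and the example records are hypothetical, and the classifier that would consume these features is left out.

```python
from difflib import SequenceMatcher


def linkage_features(record_a, record_b):
    """Build a feature vector describing how well two records agree."""
    title_similarity = SequenceMatcher(
        None, record_a["title"].lower(), record_b["title"].lower()
    ).ratio()
    year_gap = abs(record_a["year"] - record_b["year"])
    same_place = 1.0 if record_a["place"].lower() == record_b["place"].lower() else 0.0
    return [title_similarity, year_gap, same_place]


# Invented records standing in for a press directory entry and a
# library title-list entry.
directory_entry = {"title": "The Barrow Herald", "year": 1863, "place": "Barrow"}
title_list_entry = {"title": "Barrow Herald", "year": 1865, "place": "barrow"}

features = linkage_features(directory_entry, title_list_entry)
```

Feature vectors like this one, paired with manually annotated match/no-match labels, are what a decision tree or multilayer perceptron would be trained on.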
Title | Hosseini, K., upload images to Zooniverse |
Description | ~10,000 images from the digitised newspaper articles were selected and uploaded to Zooniverse for annotation. Defoe, a spark-based toolbox for analysing digital historical textual data, was used to select the images. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | The human/expert annotation is one of the main ingredients in training and evaluating supervised machine learning methods. The results of this experiment can be used in various tasks, e.g., sentence/document classification. |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/tree/master/communities-mro/zooni... |
Title | Living with Machines GitHub Stats report |
Description | This repository automatically updates GitHub statistics data for the Living with Machines GitHub Organization and generates a report based on this data. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | 38 unique repository views |
URL | https://github.com/Living-with-machines/github_stats_report |
Title | Materials for the Text to Tech workshop at the Digital Humanities Oxford Summer School |
Description | The dhoxss-text2tech Github repository contains the materials for the Text to Tech workshop at the Digital Humanities Oxford Summer School (used in the 2022 and 2023 editions). This hands-on workshop offers an introduction to programming in python and to natural language processing, from processing texts to extracting meaning from them, as well as the basics of automated semantic analysis with machine learning. The materials are publicly available, and consist of a series of jupyter notebooks, each covering a different topic. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2022 |
Provided To Others? | Yes |
Impact | These materials were used in the 2022 and 2023 editions of the Text to Tech strand of the Digital Humanities Oxford Summer School. In the 2022 edition, 35 students attended the strand. As part of the Summer School feedback survey, one of the participants said that "the notebooks were excellent and I know they will be a resource that I and the other students will keep going back to". |
URL | https://github.com/Living-with-machines/dhoxss-text2tech |
Title | Neural Language Models for Historical Research |
Description | We have pre-trained four types of neural language models on a large historical dataset of books in English, published between 1760 and 1900 and comprising ~5.1 billion tokens. The language model architectures include word type embeddings (word2vec and fastText) and contextualized models (BERT and Flair). For each architecture, we trained a model instance using the whole dataset. Additionally, we trained separate instances on text published before 1850 for the type embeddings (i.e., word2vec and fastText), and four instances considering different time slices for BERT. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2021 |
Provided To Others? | Yes |
Impact | The repository has had several forks and the language models are already being used by several researchers external to the project. |
URL | https://github.com/Living-with-machines/histLM |
Title | Repository for code underlying the paper 'Living Machines: A Study of Atypical Animacy' (COLING2020) |
Description | This repository provides underlying code and materials for the paper 'Living Machines: A Study of Atypical Animacy' (COLING2020). |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | This is the code accompanying the paper "Living Machines: A study of atypical animacy" (2020). This paper has already been cited three times in external publications, and the GitHub repository has four external stargazers and one fork. The code in this paper has been used and adapted in a forthcoming publication from the Living with Machines project. |
URL | https://github.com/Living-with-machines/AtypicalAnimacy/ |
Title | Station to Station: Linking and Enriching Historical British Railway Data |
Description | This repository provides underlying code and materials for the paper 'Station to Station: Linking and Enriching Historical British Railway Data', accepted at the Computational Humanities Research conference (2021). It contains the steps to reproduce the experiments reported in the paper and to generate a structured version of Michael Quick's book "Railway Passenger Stations in Great Britain: a Chronology". |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2021 |
Provided To Others? | Yes |
Impact | This repository contains the code used to generate StopsGB (Structured Timeline of Passenger Stations in Great Britain, https://doi.org/10.23636/wvva-3d67). This dataset is currently being used in other projects within Living with Machines, and we believe it will be of widespread interest across the historical, digital library and semantic web communities, and that it will be a key resource for ongoing research into the impact of the railway in Great Britain. The code used to generate a gazetteer is already being used in the Machine Reading Maps project. |
URL | https://github.com/Living-with-machines/station-to-station |
Title | Vane, O. OS maps metadata visualisation code |
Description | Custom visualisation of digitised 19th Century Ordnance Survey maps (from National Library of Scotland) to investigate patterns of map revision through time. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | Used tool to create supporting material for BL map digitisation proposal and to help identify suitable locations for historical case studies (factors include OS map coverage). |
Title | Vane, O., Code for filtering Kings Topographical map collection metadata |
Description | Python Jupyter notebook for filtering British Library KTop metadata by geography and time. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | Identifying relevant digitised material for Living with Machines research. |
Title | Vane, O., Code underlying a blogpost about how to put a D3 JavaScript visualisation in a Python Jupyter notebook. |
Description | Jupyter notebook demonstrating how to use JavaScript and the D3 visualisation library in a Python Jupyter notebook. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | Email from a blog reader describing it as very helpful. |
URL | https://github.com/alan-turing-institute/D3_JS_viz_in_a_Python_Jupyter_notebook |
Title | Vane, O., Strabo output visualisation code |
Description | Visualising the output of 'Strabo' tool (software tool to auto-transcribe text in historical maps by researchers at the University of Southern California Spatial Informatics Laboratory). |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | Non-statistical evaluation of the Strabo tool's success with our map data. |
Title | Wiki2Gaz: A series of scripts to create a gazetteer from Wikipedia and Wikidata |
Description | This repository contains a series of scripts to create Wiki-based resources that can be used for different geographic entity linking tasks. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2023 |
Provided To Others? | Yes |
Impact | The resources generated using these scripts are used by the T-Res tool, and have also been used for enriching other datasets in the Living with Machines project. |
URL | https://github.com/Living-with-machines/wiki2gaz |
Title | Working with maps at scale using Computer Vision and Jupyter notebooks (Notebook/code) |
Description | Notebook showing how to use computer vision/Jupyter Notebooks to support working with image collections at scale. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | Materials used at a workshop with ~30 attendees. |
URL | https://github.com/Living-with-machines/maps-at-scale-using-computer-vision-and-jupyter-notebooks |
Title | gh_orgstats |
Description | gh_orgstats is intended to provide some easy ways of getting stats for a GitHub org. gh_orgstats does this by wrapping some functions around PyGithub. This code is mainly intended to help generate reports as part of a GitHub actions pipeline to update GitHub usage stats for a funder. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | 56 unique GitHub Clones of the repository hosting the code |
URL | https://github.com/Living-with-machines/gh_orgstats |
Title | van Strien, D., Beelen, K., Coll Ardanuy, M., Hosseini, K., McGillivray, B., Colavizza, G., underlying code for the paper 'Assessing the Impact of OCR Quality on Downstream NLP Tasks' |
Description | These notebooks contain the underlying code for the paper 'Assessing the Impact of OCR Quality on Downstream NLP Tasks'. The code runs experiments reported in the paper and generates the figures used in the paper. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | This code helps the project better understand issues relating to OCR technology and will inform research methods for our projects and other projects working with text produced through OCR. |
URL | https://github.com/alan-turing-institute/lwm_ARTIDIGH_2020_OCR_impact_downstream_NLP_tasks |
Title | van Strien, D., Beelen, K., McDonough, K. 4 Jupyter notebooks on basic computer vision methods for historic OS maps |
Description | These notebooks provide an explanation on using computer vision methods with historic maps. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | These notebooks have been used in two workshops with >40 participants. They will be developed further into a series of tutorials. |
Title | van Strien, D., Beelen, K., McDonough, K. 5 Jupyter notebooks on using Deep-learning methods for computer vision on historic OS maps |
Description | Additional notebooks on using computer vision methods with historic digitised map collections. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | No |
Impact | These notebooks have been used as teaching materials in two workshops and will be developed further into publicly available tutorials. |
Title | van Strien, D., Prototype Maps annotation pipeline |
Description | A prototype method for collecting annotations from researchers, running classification and analysing historic maps at scale. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2020 |
Provided To Others? | No |
Impact | These methods have been used as an initial prototype which is currently being developed further inside the project. |
Title | 19th Century United States Newspaper Advert Classifications |
Description | A dataset of images drawn from the Library of Congress Newspaper Navigator Dataset (news-navigator.labs.loc.gov/). The dataset contains images and annotations used for training computer vision models to classify whether an advert is illustrated or not. This is a supplement to a forthcoming Programming Historian lesson (programminghistorian.org/) but can be used independently of this lesson. |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | No |
Impact | The dataset will be made public to coincide with the release of the Programming Historian Tutorials. |
Title | 19th Century United States Newspaper Advert images with 'illustrated' or 'non illustrated' labels |
Description | The dataset contains images derived from the Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/). [The Newspaper Navigator dataset] consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project. (Source: https://news-navigator.labs.loc.gov/) One of these categories is 'advertisements'. This dataset contains a sample of these images with additional labels indicating if the advert is 'illustrated' or 'not illustrated'. The data is organised as follows: the images themselves can be found in `images.zip`; `newspaper-navigator-sample-metadata.csv` contains metadata about each image drawn from the Newspaper Navigator Dataset; `ads.csv` contains the labels for the images as a CSV file; `sample.csv` contains additional metadata about the images (based on the newspapers those images came from). This dataset was created for use in an under-review Programming Historian tutorial (http://programminghistorian.github.io/ph-submissions/lessons/computer-vision-deep-learning-pt1). The primary aim of the data was to provide a realistic example dataset for teaching computer vision for working with digitised heritage material. The data is shared here since it may be useful for others. This data documentation is a work in progress and will be updated when the Programming Historian tutorial is released publicly. The metadata CSV file contains the following columns: filepath, pub_date, page_seq_num, edition_seq_num, batch, lccn, box, score, ocr, place_of_publication, geographic_coverage, name, publisher, url, page_url, month, year, iiif_url. |
Type Of Material | Database/Collection of data |
Year Produced | 2021 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/4075210 |
Title | 19th Century United States Newspaper images predicted as Photographs with labels for "human", "animal", "human-structure" and "landscape" |
Description | The dataset contains images derived from the Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/). [The Newspaper Navigator dataset] consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project. (Source: https://news-navigator.labs.loc.gov/) One of these categories is 'photographs'. This dataset contains a sample of these images with additional labels indicating if the photograph has one or more of the following labels: "human", "animal", "human-structure" and "landscape". The data is organised as follows: the images themselves can be found in `images.zip`; `newspaper-navigator-sample-metadata.csv` contains metadata about each image drawn from the Newspaper Navigator Dataset; `multi_label.csv` contains the labels for the images as a CSV file; `annotations.csv` contains the labels for the images with additional metadata. This dataset was created for use in an under-review Programming Historian tutorial (http://programminghistorian.github.io/ph-submissions/lessons/computer-vision-deep-learning-pt2). The primary aim of the data was to provide a realistic example dataset for teaching computer vision for working with digitised heritage material. The data is shared here since it may be useful for others. This data documentation is a work in progress and will be updated when the Programming Historian tutorial is released publicly. The metadata CSV file contains the following columns: filepath, pub_date, page_seq_num, edition_seq_num, batch, lccn, box, score, ocr, place_of_publication, geographic_coverage, name, publisher, url, page_url, month, year, iiif_url. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/4487140 |
Title | Alston Herald, and East Cumberland Advertiser |
Description | Alston Herald, and East Cumberland Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/31ee3711-05f9-4c94-847d-f922cc12cc36 |
Title | Atherstone, Nuneaton, and Warwickshire Times |
Description | Atherstone, Nuneaton, and Warwickshire Times was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/1efd2da4-0289-48cf-ad93-3e63139f22cd |
Title | Barrow Herald and Furness Advertiser |
Description | Barrow Herald and Furness Advertiser (1863-1914) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/33b9bd5f-5ea0-4397-883b-cd04b91a7f39 |
Title | Birkenhead News |
Description | Birkenhead News was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/30830be0-b512-4609-8e3d-be5b7f2b1498 |
Title | Blandford Weekly News |
Description | Blandford Weekly News was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/7da92592-0a38-443e-8c3c-622284b57ace |
Title | Book and newspaper databases |
Description | This database consists of ~49K books (metadata and full text, 4.5B words) and 11.8M newspaper pages (metadata only). We used the "Azure Database for PostgreSQL" service to manage this database. Various scripts and Jupyter notebooks were developed to access this database and perform exploratory data analysis. |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | No |
Impact | This database has been used in various text mining and natural language processing tasks, such as: 1) Generating language models including word2vec, fasttext, Flair and BERT type models. The book database was mainly used here as it has a large number of books suitable for training stable language models; however, we also trained several models using a sample from newspaper articles. 2) Pre-trained models used in "Assessing the Impact of OCR Quality on Downstream NLP Tasks" paper. 3) Developing the processing pipeline. |
Title | Bridlington and Quay Gazette |
Description | Bridlington and Quay Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/111e7722-a223-41af-af60-12b7bfeeb1d9 |
Title | Bridport, Beaminster and Lyme Regis telegram |
Description | Bridport, Beaminster and Lyme Regis telegram was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/a909ab42-2374-4517-aa2d-67310922e669 |
Title | Brighouse & Rastrick Gazette |
Description | Brighouse & Rastrick Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/a59056bf-46d6-427a-88fe-63dac27d0707 |
Title | British Library Books genre detection model |
Description | Model description This model is intended to predict, from the title of a book, whether it is 'fiction' or 'non-fiction'. This model was trained on data created from the Digitised printed books (18th-19th Century) book collection. The datasets in this collection are comprised and derived from 49,455 digitised books (65,227 volumes), mainly from the 19th Century. This dataset is dominated by English language books and includes books in several other languages in much smaller numbers. This model was originally developed for use as part of the Living with Machines project to be able to 'segment' this large dataset of books into different categories based on a 'crude' classification of genre i.e. whether the title was `fiction` or `non-fiction`. The model's training data (discussed more below) primarily consists of 19th Century book titles from the British Library Digitised printed books (18th-19th century) collection. These books have been catalogued according to British Library cataloguing practices. The model is likely to perform worse on any book titles from earlier or later periods. Although the model is multilingual and its training data includes non-English book titles, these appear much less frequently. How to use To use this within fastai, first install version 2 of the fastai library, following the documentation instructions. Once you have fastai installed, you can use the model as follows:
Limitations and bias The model was developed based on data from the British Library's Digitised printed books (18th-19th Century) collection. This dataset is not representative of books from the period covered, with biases towards certain types (travel) and a likely absence of books that were difficult to digitise. The formatting of the British Library books corpus titles may differ from other collections, resulting in worse performance on other collections. It is recommended to evaluate the performance of the model before applying it to your own data. This model is unlikely to perform well on contemporary book titles without further fine-tuning. Training data The training data for this model will be available from the British Library Research Repository shortly. The training data was created using the Zooniverse platform. British Library cataloguers carried out the majority of the annotations used as training data. More information on the process of creating the training data will be available soon. Training procedure Model training was carried out using the fastai library version 2.5.2. The notebook used for training the model will be available at: https://github.com/Living-with-machines/bl-books-genre-prediction Eval result The model was evaluated on a held-out test set:
              precision  recall  f1-score  support
Fiction            0.91    0.88      0.90      296
Non-fiction        0.94    0.95      0.95      554
accuracy                             0.93      850
macro avg          0.93    0.92      0.92      850
weighted avg       0.93    0.93      0.93      850 |
Type Of Material | Database/Collection of data |
Year Produced | 2021 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/5245174 |
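The evaluation figures reported for the genre detection model above include macro and weighted averages. As a minimal sketch of how such averages relate to the per-class rows, the snippet below recomputes them from the rounded values in the report. The dictionary layout and the helper names `macro_average` and `weighted_average` are illustrative assumptions, not part of the project's code, and because the inputs are already rounded to two decimals the results can differ from the published figures in the last digit.

```python
# Per-class rows copied from the reported evaluation:
# (precision, recall, f1-score, support). Values are already rounded.
report = {
    "Fiction": (0.91, 0.88, 0.90, 296),
    "Non-fiction": (0.94, 0.95, 0.95, 554),
}

SUPPORT = 3  # index of the support column in each row


def macro_average(rows, column):
    """Unweighted mean of one metric across classes."""
    values = [row[column] for row in rows.values()]
    return sum(values) / len(values)


def weighted_average(rows, column):
    """Mean of one metric across classes, weighted by class support."""
    total = sum(row[SUPPORT] for row in rows.values())
    return sum(row[column] * row[SUPPORT] for row in rows.values()) / total


total_support = sum(row[SUPPORT] for row in report.values())
macro = [macro_average(report, c) for c in range(3)]        # precision, recall, f1
weighted = [weighted_average(report, c) for c in range(3)]  # precision, recall, f1

print("support:", total_support)
print("macro avg:", [f"{v:.3f}" for v in macro])
print("weighted avg:", [f"{v:.3f}" for v in weighted])
```

For single-label classification the support-weighted recall equals the overall accuracy, which is why those two figures coincide in the report.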
Title | British Library Books genre detection model |
Description | This model is intended to predict, from the title of a book, whether it is 'fiction' or 'non-fiction'. This model was trained on data created from the Digitised printed books (18th-19th Century) book collection. The datasets in this collection are comprised and derived from 49,455 digitised books (65,227 volumes), mainly from the 19th Century. This dataset is dominated by English language books and includes books in several other languages in much smaller numbers. This model was originally developed for use as part of the Living with Machines project to be able to 'segment' this large dataset of books into different categories based on a 'crude' classification of genre i.e. whether the title was `fiction` or `non-fiction`. |
Type Of Material | Computer model/algorithm |
Year Produced | 2021 |
Provided To Others? | Yes |
Impact | Used as part of a forthcoming Living with Machines tutorial on genre classification |
URL | https://doi.org/10.5281/zenodo.5245175 |
Title | British Miner and General Newsman |
Description | British Miner and General Newsman was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/cd67adaf-bbaa-498e-9b1f-4a3c71e4ca68 |
Title | Central Glamorgan Gazette |
Description | Central Glamorgan Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/bef4a88c-ed21-4848-af9a-13c7a4b911a7 |
Title | Colne Valley Guardian |
Description | Colne Valley Guardian was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/edcbd1fc-e739-4fa9-8a12-ff0f02cc1cb6 |
Title | Cotton Factory Times |
Description | Cotton Factory Times (1885-1889, 1891-1901) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/cf778baa-be01-4fe1-ae4d-4d3ef3fccf05 |
Title | Cradley Heath & Stourbridge Observer |
Description | Cradley Heath & Stourbridge Observer (1864-1888) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/63dc3c3d-bbeb-48cf-86e1-cb203f8f0bf8 |
Title | Darlington & Richmond Herald |
Description | Darlington & Richmond Herald was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/f88c6a06-cfff-43ac-9a69-49469b4e1ea7 |
Title | Dataset for Toponym Resolution in Nineteenth-Century English Newspapers |
Description | We present a new dataset for the task of toponym resolution in digitised historical newspapers in English. It consists of 343 annotated articles from newspapers based in four different locations in England (Manchester, Ashton-under-Lyne, Poole and Dorchester), published between 1780 and 1870. The articles have been manually annotated with mentions of places, which are linked, whenever possible, to their corresponding entry on Wikipedia. The dataset is published on the British Library shared research repository, and is especially of interest to researchers working on improving semantic access to historical newspaper content. We share the 343 annotated files (one file per article) in the WebAnno TSV file format version 3.2, a CoNLL-based file format. We additionally provide a TSV file with metadata at the article level, and the annotation guidelines. |
Type Of Material | Database/Collection of data |
Year Produced | 2021 |
Provided To Others? | Yes |
Impact | This dataset has already been used by researchers working on the task of named entity recognition in historical digitised newspapers. This dataset will be used in the HIPE 2022 shared task ("Identifying Historical People, Places and other Entities", https://hipe-eval.github.io/HIPE-2022/) organised by the Impresso project, on "Named Entity Recognition and Linking in Multilingual Historical Documents". The dataset will be used by teams from different institutions to develop and assess the performance of state-of-the-art methods in the tasks of named entity recognition and entity linking. This is the second edition of the shared task; 13 teams participated in the first edition. |
URL | https://bl.iro.bl.uk/concern/datasets/de43a15c-e000-4fec-8b66-7ca94ae13db3 |
Title | Decade-level Word2Vec models from automatically transcribed 19th-century newspapers digitised by the British Library (1800-1919) |
Description | Word embeddings trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and the following parameters:
The embeddings are divided into periods of ten years each. Unlike those in this repository, these were not aligned and OCR errors skimmed from the vocabulary. See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData Project website (Living with Machines): https://livingwithmachines.ac.uk/ |
Type Of Material | Database/Collection of data |
Year Produced | 2023 |
Provided To Others? | Yes |
Impact | The models are scalable and reusable, so that more research can be carried out with the same output. |
URL | https://zenodo.org/record/7887305 |
Title | Decade-level Word2Vec models from automatically transcribed 19th-century newspapers digitised by the British Library (1800-1919) |
Description | Word embeddings trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and the following parameters:
The embeddings are divided into periods of ten years each. Unlike those in this repository, these were not aligned and OCR errors skimmed from the vocabulary. See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData Project website (Living with Machines): https://livingwithmachines.ac.uk/ |
Type Of Material | Database/Collection of data |
Year Produced | 2023 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/7887304 |
Title | Denton and Haughton Examiner |
Description | Denton and Haughton Examiner was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. Variant titles are: 1873-74 The Denton, Haughton, & District Weekly News; 1874-75 Denton & Haughton Weekly News, and Audenshaw, Hooley Hill, and Dukinfield Advertiser; 1875-78 Denton Examiner, Audenshaw, Hooley Hill and Dukinfield Advertiser; 1878-92 Denton and Haughton Examiner, etc. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/2a759cdb-6203-438d-8148-30b40bfe734c |
Title | Diachronic and diatopic word embeddings from newspapers digitised by the British Library (1830-1889): North and South England |
Description | Diachronic word embeddings (decade-level) trained with Word2Vec (via Gensim) on different geographic subcorpora of the Heritage Made Digital British and the Living with Machines historical newspaper collections: - North England (north.zip) - South England (south.zip) At the moment, for each subcorpus, Word2Vec models are available for each decade in the period 1830-1889. More models are on the way for the following: - each decade in the periods 1780-1829 and 1890-1920 for both North and South England. - diachronic models for the following regions: Scotland, Wales, and Midlands. The models were trained using the following parameters:
Like the embeddings in this repository, the model for each decade was aligned to the most recent one with Orthogonal Procrustes. See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData. Project website (Living with Machines): https://livingwithmachines.ac.uk/ Data related to: Nilo Pedrazzini & Barbara McGillivray, Diachronic and diatopic word embeddings from British historical newspapers, presented at AIUCD (Convegno dell'Associazione per l'Informatica Umanistica e la Cultura Digitale) in Siena (Italy), June 2023. |
Type Of Material | Database/Collection of data |
Year Produced | 2023 |
Provided To Others? | Yes |
Impact | The models are scalable and reusable, so that more research can be carried out with the same output. |
URL | https://zenodo.org/record/7892460 |
Title | Diachronic and diatopic word embeddings from newspapers digitised by the British Library (1830-1889): North and South England |
Description | Diachronic word embeddings (decade-level) trained with Word2Vec (via Gensim) on different geographic subcorpora of the Heritage Made Digital British and the Living with Machines historical newspaper collections: - North England (north.zip) - South England (south.zip) At the moment, for each subcorpus, Word2Vec models are available for each decade in the period 1830-1889. More models are on the way for the following: - each decade in the periods 1780-1829 and 1890-1920 for both North and South England. - diachronic models for the following regions: Scotland, Wales, and Midlands. The models were trained using the following parameters:
Like the embeddings in this repository, the model for each decade was aligned to the most recent one with Orthogonal Procrustes. See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData. Project website (Living with Machines): https://livingwithmachines.ac.uk/ Data related to: Nilo Pedrazzini & Barbara McGillivray, Diachronic and diatopic word embeddings from British historical newspapers, presented at AIUCD (Convegno dell'Associazione per l'Informatica Umanistica e la Cultura Digitale) in Siena (Italy), June 2023. |
Type Of Material | Database/Collection of data |
Year Produced | 2023 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/7892459 |
Title | Diachronic word embeddings from 19th-century British newspapers |
Description | Word vectors related to the paper Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers by Nilo Pedrazzini and Barbara McGillivray (2022). The embeddings were trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and the following parameters:
The embeddings are divided into periods of ten years each, with the vectors from each decade aligned to the ones from the most recent decade (1910s) using Orthogonal Procrustes. See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/7181682 |
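The description above says that each decade's vectors were aligned to the most recent decade using Orthogonal Procrustes. The following is a minimal sketch of that general technique in NumPy, not the project's own code (see the linked GitHub repository for that). It assumes two embedding matrices whose rows correspond to the same shared vocabulary in the same order; the function name `procrustes_align` and the synthetic data are illustrative assumptions.

```python
import numpy as np


def procrustes_align(source, target):
    """Rotate `source` into the space of `target` with an orthogonal map.

    Both arrays have shape (n_words, dim) and row i must be the same word
    in both. Returns the rotated source and the rotation matrix.
    """
    # The orthogonal matrix minimising ||source @ R - target|| (Frobenius norm)
    # is U @ Vt, where U, S, Vt is the SVD of source.T @ target.
    u, _, vt = np.linalg.svd(source.T @ target)
    rotation = u @ vt
    return source @ rotation, rotation


# Synthetic check: build a "decade" that is an exact rotation of the reference.
rng = np.random.default_rng(0)
target = rng.normal(size=(50, 8))                        # stand-in reference decade
true_rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # random orthogonal matrix
source = target @ true_rotation.T                        # rotated copy of the reference

aligned, rotation = procrustes_align(source, target)
print("max alignment error:", np.abs(aligned - target).max())
```

Because the map is orthogonal, distances and cosine similarities within each decade are preserved; only the orientation of the space changes, which is what makes vectors from different decades comparable.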
Title | Diachronic word embeddings from 19th-century newspapers digitised by the British Library (1800-1919) |
Description | Word vectors related to the paper Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers by Nilo Pedrazzini and Barbara McGillivray (2022). The embeddings were trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and the following parameters:
The embeddings are divided into periods of ten years each, with the vectors from each decade aligned to the ones from the most recent decade (1910s) using Orthogonal Procrustes. See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData Project webpage (Living with Machines): https://livingwithmachines.ac.uk/ |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/7181681 |
Title | Digitised historical newspapers |
Description | Newspapers digitised by the British Library for the LwM project, with OCR processing performed by FindMyPast and supplied in a format consistent with the BNA. The dataset comprises ~630 GB of digitised text in METS/ALTO XML format and 435,642 JP2 image files (~6 TB) for 94 newspaper titles. |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | No |
Impact | Analysis of British historical newspaper content at scale. |
Title | Dorset County Express and Agricultural Gazette |
Description | Dorset County Express and Agricultural Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/7048dfa0-aea5-410f-a6dd-063b74a2c955 |
Title | Example computer vision classification training data derived from British Library 19th Century Books Image collection |
Description | Example computer vision classification training data derived from the British Library 19th Century Books Image collection. This dataset provides training data for image classification for use in a computer vision workshop. The images are derived from 'Digitised Books - Images identified as Embellishments. c. 1510 - c. 1900. JPG' from the year '1839'. |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | 85 Downloads of the dataset |
URL | https://zenodo.org/record/3689444 |
Title | Example computer vision classification training data derived from British Library 19th Century Books Image collection |
Description | Example computer vision classification training data derived from the British Library 19th Century Books Image collection. This dataset provides training data for image classification for use in a computer vision workshop. The images are derived from 'Digitised Books - Images identified as Embellishments. c. 1510 - c. 1900. JPG' from the year '1839'. Currently, four folders are included, containing a variety of images derived from the BL books corpus: 'cv_workshop_exercise_data' includes images of 'building', 'people' and 'coat of arms'; 'humancats' contains images of humans and images of cats; the 'fashion' and 'portraits' folders both contain images of people organised into 'female' and 'male'. These labels were annotated by a single annotator and these categories may themselves not be meaningful. They are included in the workshop data as a point of discussion about how we should label data both in general and when working with historical data. This data is intended primarily as an educational resource. |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/3667575 |
Title | Forest of Dean Examiner |
Description | Forest of Dean Examiner (1873-1877) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/bde51795-fcf2-4a78-b232-47bae8b952c4 |
Title | Frederick May's London Press Dictionary and Advertiser's Handbook (1883-1911) |
Description | Newspaper directories produced and published annually in 19th-century Britain by advertising agent Frederick May and successors, containing information on newspapers, magazines and periodicals and arranged in alphabetical and sometimes tabular order. Information for each title included price, publisher, office, and political and religious leaning. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/536678c9-4c26-41d2-bcbc-5b209ab393b4 |
Title | Glasgow Chronicle |
Description | Glasgow Chronicle was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/f63cbf7f-0380-4520-86a5-671b555cb274 |
Title | Glasgow Courier |
Description | Glasgow Courier was a thrice weekly/bi-weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/e4a924fa-608f-443f-88f0-d8b42009c88f |
Title | Halifax Local Opinion |
Description | The Halifax Local Opinion was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. This dataset (BLNewspapers_HalifaxLocalOpinion0003063_1892.zip) is currently unavailable due to a technical glitch when uploading larger files into the repository. Hopefully this will be resolved and the dataset will be available by the end of March 2023. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/65f3bbae-e3d4-419b-bf94-c14b57a691c0 |
Title | Images from Newspaper Navigator predicted as maps, with human corrected labels |
Description | The Dataset contains images derived from the Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/). The Newspaper Navigator dataset consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project. source: https://news-navigator.labs.loc.gov/ One of these categories is 'maps'. In the original training data for Newspaper Navigator, there were relatively few labelled examples of maps. The predictions for maps have an Average Precision of 69.5%, and 34 images in the validation data. This dataset contains a sample of these images which have been predicted as 'maps'. It also includes additional labels which indicate whether the predicted map image is a 'map' or 'not a map'. |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | Used as data for an example notebook showing how to train computer vision models. 59 downloads of the dataset (5/11/2020) |
URL | https://zenodo.org/record/4156510 |
Title | Images from Newspaper Navigator predicted as maps, with human corrected labels |
Description | The Dataset contains images derived from the Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/). [The Newspaper Navigator dataset] consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project. source: https://news-navigator.labs.loc.gov/ One of these categories is 'maps'. In the original training data for Newspaper Navigator, there were relatively few labelled examples of maps. The predictions for maps have an Average Precision of 69.5%, and 34 images in the validation data. This dataset contains a sample of these images which have been predicted as 'maps'. It also includes additional labels which indicate whether the predicted map image is a 'map' or 'not a map'. The data is organised as follows: the images themselves can be found in 'newspaper_maps.zip'; `2020_30_10_13_19_228_sample.json` contains metadata about each image drawn from the Newspaper Navigator Dataset; `map_labels.csv` contains the labels for the images as a CSV file. |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/4156509 |
Title | Images from Newspaper Navigator predicted as maps, with human corrected labels |
Description | The Dataset contains images derived from the Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/). [The Newspaper Navigator dataset] consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project. source: https://news-navigator.labs.loc.gov/ One of these categories is 'maps'. In the original training data for Newspaper Navigator, there were relatively few labelled examples of maps. The predictions for maps have an Average Precision of 69.5%, and 34 images in the validation data. This dataset contains a sample of these images which have been predicted as 'maps'. It also includes additional labels which indicate whether the predicted map image is a 'map' or 'not a map'. The data is organised as follows: the images themselves can be found in 'newspaper_maps.zip'; `2020_30_10_13_19_228_sample.json` contains metadata about each image drawn from the Newspaper Navigator Dataset; `map_labels.csv` contains the labels for the images as a CSV file. |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/4156510 |
Title | Kasra Hosseini, language model zoo |
Description | Collection of trained word embeddings and language models, mainly by using the book database. Various model types are trained and added to the collection, e.g., word2vec, fasttext, contextual string embeddings (Flair), BERT. |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | No |
Impact | Language models and word-embeddings are one of the main ingredients in many NLP-related tasks in this project. Here, we keep track of the trained models, so researchers can easily find the models and use them for their research. |
URL | https://github.com/alan-turing-institute/Living-with-Machines-code/blob/master/language_models/noteb... |
Title | Kenilworth Advertiser |
Description | Kenilworth Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/5c6af14a-5dba-4f60-b8a3-c919ce0e5ef6 |
Title | Lancaster Herald and Town and County Advertiser |
Description | Lancaster Herald and Town and County Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/da8cebfc-e531-443b-b1fe-b4b39d18c302 |
Title | Lancaster Standard and County Advertiser |
Description | Lancaster Standard and County Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/f3020250-fc33-4405-b28c-b0194a31e049 |
Title | Liverpool Weekly Courier |
Description | Liverpool Weekly Courier was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/b8d6e83d-57f8-4ca0-ac9c-5cf88cea48c3 |
Title | Living Machines atypical animacy dataset |
Description | Atypical animacy detection dataset, based on nineteenth-century sentences in English extracted from an open dataset of nineteenth-century books digitized by the British Library (available via https://doi.org/10.21250/db14, British Library Labs, 2014). This dataset contains 598 sentences containing mentions of machines. Each sentence has been annotated according to the animacy and humanness of the machine in the sentence. This dataset has been created as part of the following paper: Ardanuy, M. C., F. Nanni, K. Beelen, Kasra Hosseini, Ruth Ahnert, J. Lawrence, Katherine McDonough, Giorgia Tolfo, D. C. Wilson and B. McGillivray. "Living Machines: A study of atypical animacy." In Proceedings of the 28th International Conference on Computational Linguistics (COLING2020). |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/work/323177af-6081-4e93-8aaf-7932ca4a390a |
Title | Living with Machines Zooniverse Participant Survey |
Description | Summary results from a survey of contributors to Living with Machines Zooniverse crowdsourcing projects. Responses were received between 24 May and 13 June 2022. We designed the survey so that we could align our reporting with two other audience / participant research groups. Firstly, we used the demographic categories that the British Library use in other reporting, allowing us to see Zooniverse volunteers alongside other groups using the British Library's collections. Secondly, we aligned questions about motivations and barriers to participation with the CS Track citizen science research project survey. Our thanks to colleagues on the CS Track https://cstrack.eu/ project for permission to use options from their survey: Lampi, Emilia; Paajanen, Samu; Lämsä, Joni; Hämäläinen, Raija; Hästbacka, Heli; Sabel, Ohto. CSTrack Survey Data 2021. V. 12.8.2021. 10.17011/jyx/dataset/79371 https://jyx.jyu.fi/handle/123456789/79371 |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/cb1ce859-a35d-46bd-9362-02c005fb66eb |
Title | Living with Machines alpha and beta Zooniverse 'accident' task data |
Description | Data created through crowdsourcing tasks hosted on the Zooniverse platform. Members of the public were asked to look at a selection of articles from 19th century newspapers that mentioned machines and decide if they described an industrial accident. A further task asked participants to transcribe personal, organisational and place names mentioned, and add a brief summary of relevant accidents. |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/work/4d262a8a-255b-45a1-a0fe-dc4af48e9798 |
Title | Living with Machines alpha and beta Zooniverse 'accident' task data |
Description | Data created through crowdsourcing tasks hosted on the Zooniverse platform. Members of the public were asked to look at a selection of articles from 19th century newspapers that mentioned machines and decide if they described an industrial accident. A further task asked participants to transcribe personal, organisational and place names mentioned, and add a brief summary of relevant accidents. |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | Publishing the data is part of our contract with crowdsourcing participants, and provides evidence of our commitment to transparency and data sharing. |
URL | https://doi.org/10.23636/1197 |
Title | MapReader_Data_SIGSPATIAL_2022 |
Description | MapReader in GeoHumanities workshop (SIGSPATIAL 2022): Gold standards and outputs Refer to: https://github.com/Living-with-machines/MapReader/wiki/GeoHumanities-workshop-in-SIGSPATIAL-2022 |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/7116800 |
Title | MapReader_Data_SIGSPATIAL_2022 |
Description | MapReader in GeoHumanities workshop (SIGSPATIAL 2022): Gold standards and outputs Refer to: https://github.com/Living-with-machines/MapReader/wiki/GeoHumanities-workshop-in-SIGSPATIAL-2022 |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
Impact | Republished on National Library of Scotland Data Foundry website: https://data.nls.uk/data/map-spatial-data/living-with-machines-railspace-building/ |
URL | https://zenodo.org/record/7147906 |
Title | Mariona Coll-Ardanuy - Creation of toponym resolution datasets (ongoing). |
Description | Creation of toponym resolution datasets: ~1000 newspaper articles manually annotated with mentions of places and their geographical coordinates. The annotations are not yet complete. |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | No |
Impact | Ongoing work. We aim to publish the dataset as soon as the annotations are complete. They will serve to assess the performance of our toponym resolution method and will be a contribution to several fields, such as geographic information retrieval, computational linguistics, and digital humanities. |
Title | Mariona Coll-Ardanuy, Creation of a gazetteer for toponym resolution (ongoing). |
Description | Creation of a gazetteer for toponym resolution (alpha version). This is a Wikipedia-based gazetteer, enriched with data from the geographical database Geonames. The alpha version of the code that creates the gazetteer has already been released (see URL below). This work is ongoing: we are working on enriching it with data from historical sources (maps and text). |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | No |
Impact | The gazetteer has not been made available, but the publication and the code repository, with instructions on how to create it, are publicly available. |
URL | https://github.com/alan-turing-institute/lwm_GIR19_resolving_places |
Title | May's British and Irish Press Guide and Advertiser's Handbook & Dictionary etc. (1871-1880) |
Description | Newspaper directories produced and published annually in 19th-century Britain by advertising agent Frederick May and successors, containing information on newspapers, magazines and periodicals, arranged in alphabetical and sometimes tabular order. Information for each title included price, publisher, office, and political and religious leaning. |
Type Of Material | Database/Collection of data |
Year Produced | 2021 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/0a4f3f09-11ff-4360-a73e-ce3a7654f14c |
Title | Midland Examiner and Wolverhampton Times |
Description | Midland Examiner and Wolverhampton Times (1874-1878) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/d4cf88ee-0df9-4b38-9f40-9ec1d04e7e56 |
Title | Neural Language Models for Nineteenth-Century English |
Description | We present four types of neural language models trained on a large historical dataset of books in English, published between 1760 and 1900, and comprised of ~5.1 billion tokens. The language model architectures include word type embeddings (word2vec and fastText) and contextualized models (BERT and Flair). For each architecture, we trained a model instance using the whole dataset. Additionally, we trained separate instances on text published before 1850 for the type embeddings, and four instances considering different time slices for BERT. Our models have already been used in various downstream tasks where they consistently improved performance. |
Type Of Material | Computer model/algorithm |
Year Produced | 2021 |
Provided To Others? | Yes |
Impact | Even though word2vec has been around for almost a decade (an eternity in the fast-moving NLP ecosystem), the word type embeddings it produces persist as popular instruments, especially for interdisciplinary research (Azarbonyad et al. 2017; Hengchen, Ros, & Marjanen, 2019). The more recent fastText model extends word2vec by using subword information. Contextualized language models have meant a breakthrough in NLP research (see e.g. Smith (2019) for an overview), as they represent words in the contexts in which they appear, instead of conflating all senses, one of the main criticisms of word type embeddings. The potential of using such models for historical research is immense, as they allow a more accurate context-dependent representation of meaning. These embeddings can also be used in existing tools for historical research (e.g. Hosseini, Nanni, and Coll Ardanuy (2020)). Given that existing libraries, such as Gensim, Flair, or Hugging Face, provide convenient interfaces to work with these embeddings, we are confident that our historical models will serve the needs of a wide variety of scholars, from NLP and data science to the humanities, for different tasks and research purposes, such as measuring how words change meaning over time (Kulkarni, Al-Rfou, Perozzi, & Skiena, 2015; Tahmasebi, Borin, & Jatowt, 2018), automatic OCR correction (Hämäläinen & Hengchen, 2019), interactive query expansion or, more generally, any research that involves diachronic language change. |
URL | https://zenodo.org/record/4782245 |
Title | Neural Language Models for Nineteenth-Century English (dataset; language model zoo) |
Description | This dataset contains four types of neural language models trained on a large historical dataset of books in English, published between 1760-1900 and comprised of ~5.1 billion tokens. The language model architectures include static (word2vec and fastText) and contextualized models (BERT and Flair). For each architecture, we trained a model instance using the whole dataset. Additionally, we trained separate instances on text published before 1850 for the two static models, and four instances considering different time slices for BERT. Github repository: https://github.com/Living-with-machines/histLM |
Type Of Material | Database/Collection of data |
Year Produced | 2021 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/4779090 |
Title | Newspaper Directories digitised, OCRed, modelled and structured data extracted from Mitchell's directories (1846-1909) |
Description | This collection includes a subset of Mitchell's Newspaper Press Directories which is annotated and structured for future incorporation in the Newspaper database. |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | No |
Impact | The information extracted from the Press Directories will significantly contribute to enriching newspaper data received from Heritage Made Digital, FindMyPast and JISC. It will also contribute to the environmental scan project and paper. |
Title | North Cumberland Reformer |
Description | North Cumberland Reformer (1890 - 1898) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/1badb649-ad58-416f-a403-13228780c964 |
Title | Northern Guardian (Hartlepool) |
Description | Northern Guardian (Hartlepool) (1891 - 1902) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/62fccfe5-7ed8-46aa-acd1-47a6b69dd7fb |
Title | Northern Weekly Gazette |
Description | Northern Weekly Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://nms.iro.bl.uk/concern/datasets/73abb348-9b50-429e-89a2-d304c1fbcee6 |
Title | Nuneaton Times |
Description | Nuneaton Times was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/c0b347e8-1075-4c90-8ab2-b21cab338ef5 |
Title | Ordnance Survey Old / First series England and Wales 1:63360 (georeferenced sheet images) |
Description | Map sheet images for the Ordnance Survey Old Series / First Series England and Wales 1:63360, georeferenced and cropped at the neatline (can be viewed together as a seamless composite). GeoTIFF format. The original (ungeoreferenced) sheet images can be found at: https://commons.wikimedia.org/wiki/Category:Ordnance_Survey_Old/First_series_England_and_Wales_1:63360_(full_sheets). The sheets were georeferenced by relating the sheet corners to their coordinates (no internal control points applied). |
Type Of Material | Database/Collection of data |
Year Produced | 2021 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/2fa13eb5-1767-469b-b4c0-d9d518bfc1b3#?c=0&m=0&s=0&cv=0&xywh=0%... |
Title | Pontypridd District Herald |
Description | Pontypridd District Herald was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/9bc902fc-d174-4149-8395-bbdee46e4309 |
Title | Poole Telegram |
Description | Poole Telegram was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/cd062f62-e184-4013-bcba-40e959eba4ac |
Title | Potteries Examiner |
Description | Potteries Examiner (1871 - 1881) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/2b1174ea-5c32-4acb-a596-254a16f7b54e |
Title | Shropshire Examiner |
Description | Shropshire Examiner (1874-1877) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://kew.iro.bl.uk/concern/datasets/a52d650d-4a54-40b0-b580-cb19ac5aa744 |
Title | South Staffordshire Examiner |
Description | South Staffordshire Examiner (1874) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/c500f18c-0d24-4cb4-9c3e-1865d5d89e04 |
Title | St. Helens Examiner |
Description | St. Helens Examiner was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/01932735-62a4-4846-94e1-496d32838f8f |
Title | Stalybridge Examiner |
Description | Stalybridge Examiner (1876) has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/8b771dc5-4b6b-44f5-8bda-7ed59a5f875d |
Title | Stockton Herald, South Durham and Cleveland Advertiser |
Description | Stockton Herald, South Durham and Cleveland Advertiser (1858 - 1918) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/c399bc41-c5f6-45c2-b992-38c9d9553cad |
Title | StopsGB: Structured Timeline of Passenger Stations in Great Britain |
Description | Michael Quick's book _Railway Passenger Stations in Great Britain: a Chronology_ offers a uniquely rich and detailed account of Britain's changing railway infrastructure. Its listing of over 12,000 stations allows us to reconstruct the coming of rail at both micro- and macro-scales. However, being published originally as a book (and subsequently online as a PDF created from an underlying MS Word document), this resource was not well suited for systematic linking to other data. We now present a new, automatically generated dataset that provides the rich detail of this exceptional resource in a structured format. Each station described in the _Chronology_ is given certain attributes, such as operating companies and opening and closing dates, and is georeferenced and linked, whenever possible, to its corresponding entry on Wikidata. We name this structured, linked, and georeferenced dataset 'StopsGB' (Structured Timeline of Passenger Stations in Great Britain), and we make it openly available. We believe this dataset (and the method used to create it) will be of widespread interest across the historical, digital library and semantic web communities, and that it will be a key resource for ongoing research into the impact of the railway in Great Britain. |
Type Of Material | Database/Collection of data |
Year Produced | 2021 |
Provided To Others? | Yes |
Impact | This is a new contribution. We expect that this dataset (and the method used to create it) will be of widespread interest across the historical, digital library and semantic web communities, and that it will be a key resource for ongoing research into the impact of the railway in Great Britain. |
URL | https://bl.iro.bl.uk/concern/datasets/0abea1b1-2a43-4422-ba84-39b354c8bb09 |
Title | Stretford and Urmston Examiner |
Description | Stretford and Urmston Examiner (1879 - 1880) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/93ecb6b9-f982-4ee6-846a-f2dfbd3a5ce6 |
Title | Supplementary material for 'A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching' |
Description | Supplementary material for the https://github.com/Living-with-machines/LwM_SIGSPATIAL2020_ToponymMatching repository, containing the underlying code and materials for the paper 'A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching', accepted to SIGSPATIAL2020 as a poster paper. Coll Ardanuy, M., Hosseini, K., McDonough, K., Krause, A., van Strien, D. and Nanni, F. (2020): A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching, SIGSPATIAL: Poster Paper. |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/4034819 |
Title | Supplementary material for 'A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching' |
Description | Supplementary material for the https://github.com/Living-with-machines/LwM_SIGSPATIAL2020_ToponymMatching repository, containing the underlying code and materials for the paper 'A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching', accepted to SIGSPATIAL2020 as a poster paper. Coll Ardanuy, M., Hosseini, K., McDonough, K., Krause, A., van Strien, D. and Nanni, F. (2020): A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching, SIGSPATIAL: Poster Paper. |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/4034818 |
Title | Supplementary material for 'Station to Station: Linking and Enriching Historical British Railway Data' |
Description | Supplementary material for the station-to-station Github repository, containing the underlying code and materials for the paper 'Station to Station: Linking and Enriching Historical British Railway Data', accepted to CHR2021 (Computational Humanities Research). Mariona Coll Ardanuy, Kaspar Beelen, Jon Lawrence, Katherine McDonough, Federico Nanni, Joshua Rhodes, Giorgia Tolfo, and Daniel C.S. Wilson. "Station to Station: Linking and Enriching Historical British Railway Data." In Computational Humanities Research (CHR2021). 2021. |
Type Of Material | Database/Collection of data |
Year Produced | 2021 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/5520882 |
Title | Swansea Journal and South Wales Liberal |
Description | Swansea Journal and South Wales Liberal was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/b5e6823b-0b68-4b2b-a1a6-4d3e558cb1eb |
Title | Swansea and Glamorgan Herald, and South Wales Free Press |
Description | Swansea and Glamorgan Herald, and South Wales Free Press (1847 - 1890) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/76b247db-9bb0-44ea-90bd-50770def196a |
Title | Tamworth Miners' Examiner and Working Men's Journal |
Description | Tamworth Miners' Examiner and Working Men's Journal (1873 - 1876) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/a71d3774-e548-4a36-9a55-b4e22e0d6761 |
Title | The Blackpool Gazette & Herald |
Description | The Blackpool Gazette & Herald (1874 - 1919) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. All but one of these datasets is currently unavailable due to a technical glitch when uploading larger files into the repository. Hopefully this will be resolved and all the datasets will be available by the end of March 2023. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/e385dfcb-0310-44f5-b84a-c714e6464324 |
Title | The Cannock Chase Examiner |
Description | The Cannock Chase Examiner (1874-1877) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/e5c83b89-dc98-4b08-84a8-ddba4a842f8e |
Title | The Newspaper Press Directory (1846-1880) |
Description | Newspaper directories produced and published annually in 19th-century Britain by advertising agent Charles Mitchell. Newspapers are listed primarily in alphabetical order of the town where the title was published. Information for each title included: features connected with the district such as population and trade; principal towns in district; title, price, day of publication; politics; date of first issue; political leanings and special interests; proprietors and publishers. Information on some overseas titles is also included in selected years. |
Type Of Material | Database/Collection of data |
Year Produced | 2021 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/020c22c4-d1ee-4fca-bf75-0420fe59347a |
Title | The Newspaper Press Directory (1846-1920) - enriched and structured version |
Description | Mitchell's Newspaper Press Directories contained an almost complete list of newspapers published in England, Wales, Scotland and Ireland. They were published regularly from 1846 onwards and provided a detailed description of the newspaper landscape over time. This version contains a structured, tabular representation of the directories (as CSV or Excel spreadsheet). Each row describes a newspaper at a specific point in time. We record title, politics, price, location and other information. Please consult the data card for a detailed overview of the data structure and for more background on the digitisation process. |
Type Of Material | Database/Collection of data |
Year Produced | 2023 |
Provided To Others? | Yes |
Impact | - This data underpins previous and future research papers which apply the Environmental Scan method to historical newspaper collections - Also used in many of the digital residencies - In general, a useful resource for media historians and those looking for information about the 19th-century press |
URL | https://bl.iro.bl.uk/concern/datasets/adcef12a-bb3d-40d9-871d-5784022a77e8 |
Title | The Newspaper Press Directory (1881-1920) |
Description | Newspaper directories produced and published annually in 19th-century Britain by advertising agent Charles Mitchell. Newspapers are listed primarily in alphabetical order of the town where the title was published. Information for each title included: features connected with the district such as population and trade; principal towns in district; title, price, day of publication; politics; date of first issue; political leanings and special interests; proprietors and publishers. Information on some overseas titles is also included in selected years. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/e943d0e1-59dd-48b5-9a0b-dc7723b30749 |
Title | The Runcorn Examiner |
Description | The Runcorn Examiner (1870-1954) was a weekly newspaper and years 1870-1920 have been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/f4761b24-54b6-413c-8acb-92cef09866fb |
Title | The Stockton Examiner |
Description | The Stockton Examiner (1878-1879) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/ce3731c3-a998-4054-a761-6c68e6c1a626 |
Title | Warrington Examiner |
Description | Warrington Examiner (1869-1901) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/2e4efac2-e341-467b-86e9-4269ec07c474 |
Title | Warwickshire Herald |
Description | Warwickshire Herald was a weekly newspaper which has been digitised by the British Library for the Living with Machines project |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/29a92211-d26e-49d6-9fd0-e6854768cb86 |
Title | Weekly Journal |
Description | The file consists of the OCR (Optical Character Recognition) text in XML format for one year of Weekly Journal (Hartlepool), 1901. The full digitised newspaper comprises no. 1-407 (29 Nov. 1901 - 17 Sep. 1909). The digitised page images are available on the British Newspaper Archive website, https://www.britishnewspaperarchive.co.uk/titles/weekly-journal-hartlepool The British Newspaper Archive page images are behind a paywall, but from March 2021 the paywall will be lifted and some of these images will be free to view. The newspaper continued beyond 1901 but this has not been included in this dataset due to copyright considerations. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/7401d41e-0f67-407d-8f2b-e3f8ba02b7f5 |
Title | Weymouth Telegram |
Description | Weymouth Telegram (1860 - 1901) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/85d1ab3b-2902-41ba-91db-4cde128e181a |
Title | Widnes Examiner |
Description | Widnes Examiner (1876-1920) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://bl.iro.bl.uk/concern/datasets/027b1e1a-b8bb-4160-99f3-41bca7ba2377 |
Description | Collaboration with the Estonian War Museum on a Europeana-funded project |
Organisation | Europeana |
Country | Netherlands |
Sector | Public |
PI Contribution | I was invited to be a named researcher on a bid by the Estonian War Museum to run a workshop and pilot mini-crowdsourcing projects, funded by Europeana. I contributed to their survey design, and devised and ran six structured sessions within a two-day workshop, designed to take organisations through the processes involved in planning a successful crowdsourcing project. The workshops included prompts for discussion across many departments and disciplines within an organisation, and concluded with group presentations of the ideas developed through the workshops. |
Collaborator Contribution | The project "Crowdsourcing for military heritage in Estonia" is funded with 9,925 euros for the period January to June 2022. The Estonian War Museum leads the project, organising the workshops, running the survey and reporting on the results, and monitoring the five projects as they develop to June 2022. |
Impact | The final outputs will be five small-scale crowdsourcing projects by Estonian museums, a survey, and publications on the lessons the institutions running them learned from the research project. |
Start Year | 2021 |
Description | Digital Residency of Jennifer Hayward and team: Unlocking the Past: Structured Data Extraction in 19th Century Chilean Newspapers |
Organisation | Adolfo Ibáñez University |
Country | Chile |
Sector | Academic/University |
PI Contribution | Provided funding, data and mentoring |
Collaborator Contribution | Delivered a project on digitised Chilean newspapers, outlined here: https://livingwithmachines.ac.uk/lwm-digital-residency-unlocking-the-past-structured-data-extraction-in-19th-century-chilean-newspapers/ |
Impact | The project was delivered and will continue to inform future work. Formal report was submitted and will appear in due course on the BL Research Repository (https://bl.iro.bl.uk/) |
Start Year | 2023 |
Description | Digital Residency of Joanne Shepard: Archiving The Railway UK (AR-UK) |
Organisation | Durham University |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We provided funding, data (StopsGB dataset), and mentoring for this digital residency. |
Collaborator Contribution | Shepard developed the Archiving The Railway UK (AR-UK) project, reported on here: https://livingwithmachines.ac.uk/lwm-digital-residency-archiving-the-railway-uk-ar-uk/ |
Impact | The official project report will be available on the BL's research repository in due course. https://bl.iro.bl.uk/ |
Start Year | 2023 |
Description | Humphrey Southall (Vision of Britain) |
Organisation | University of Southampton |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Reuse of data and citation. |
Collaborator Contribution | Data sets shared in addition to those available for download on the Vision of Britain site, including a simplified data table. |
Impact | Data sharing. |
Start Year | 2019 |
Description | Living with Machines and Find My Past |
Organisation | Findmypast |
Country | United Kingdom |
Sector | Private |
PI Contribution | We will be sharing the methods and outcomes of our research on this data, for example OCR correction, and toponym resolution. |
Collaborator Contribution | FMP has shared newspaper data with Living with Machines for two counties (Lancashire and Dorset), and in the near future will be sharing all newspapers from Britain dating 1780-1920 that were digitised by FMP for the British Newspaper Archive. A member of FMP also sits on Living with Machines' Advisory Board. |
Impact | Findmypast has provided samples of the British Library's digitised Newspaper Collection and has advised us through its membership of the Living with Machines Advisory Board. There are prospects of working together on OCR correction following the ingestion of other incoming full datasets from the same collection. |
Start Year | 2018 |
Description | Living with Machines and National Library of Scotland |
Organisation | National Library of Scotland |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Living with Machines initiated contact with Chris Fleet, map curator at the NLS, to investigate access to their digitized map collections. K. McDonough and O. Vane have worked closely with Fleet over the last 9 months to share and evaluate the digital map holdings. We organized a workshop (June 2019) at the Turing/BL with Chris and other historical maps experts to explore best practices in working with large collections for a digital humanities project. We have shared back reflections and code for enriching the collection metadata and visualizing the collections, and have also developed a close working relationship that will continue to grow (through the sharing of additional maps and metadata as well as collaborative research into other ways of sharing digital map data with researchers through IIIF). |
Collaborator Contribution | NLS Maps curator Chris Fleet has shared a subset of the 200,000 digitized sheets, to be expanded on in the near future. He has provided extensive advice and support for working with the metadata, accessing versions of the maps as web map tiles, and thinking about the next steps of using these materials in a computational research environment. He has also been immensely helpful in connecting Living with Machines to the small, but growing community of researchers using machine learning methods with maps. |
Impact | Blog posts (Computational Approaches to Ordnance Survey Maps: Finding words in maps, part 2: seeing the results blog post); Talks (Katie McDonough, Olivia Vane, and Daniel van Strien gave a '21st Century Talk' for British Library staff: 'Maps and Machines: Using Computer Vision to Analyze the Geography of Industrialization (1780-1920)', 14 Jan 2020; Daniel van Strien, Kaspar Beelen, CREATE Digital History Workshop: Maps-as-Data: Analysing Historical Maps with Computer Vision, Feb 2020; Katherine McDonough, "Living with Machines," presentation at DH Seminar, Center for Spatial and Textual Analysis, Stanford University, December 2, 2019; Katherine McDonough, "Living with Machines," invited presentation at Spatial Relationships in Text as Data, The Alan Turing Institute, October 28, 2019; Katherine McDonough and Jon Lawrence, "An introduction to Living with Machines," University of Exeter DH Seminar, 23 October 2019); Workshops (Daniel van Strien, British Library Digital Scholarship Training program, workshop on computer vision for historical maps, 13 February 2020; and Katherine McDonough, Fantastic Futures, invited presentation and workshop on computer vision for historical maps, 4-5 December 2019); and Meetings (Katherine McDonough organized meetings with US experts in historical map processing using computer vision, 29/8/2019 and 1/11/2019). |
Start Year | 2019 |
Title | Branching sparklines / line graphs |
Description | This notebook demonstrates the branching design used in Press Picker: an interactive visualisation tool for newspaper metadata at the British Library, created in the Living with Machines project. Press Picker shows the holdings per year of different UK newspapers at the library, and their different formats. We used branching to communicate newspapers changing their name. Through history, newspapers sometimes change their name multiple times, particularly local papers. For example, The Athletic Reporter in 1886 becomes The Reporter, which in 1888 becomes The Midland Counties Reporter and General Advertiser, which in 1889 becomes The Reporter and General Advertiser, and so on. In the British Library data, a new name is treated as a wholly separate record. Introducing this branching means we bring together data that, to some extent, is referring to the same thing. |
Type Of Technology | Webtool/Application |
Year Produced | 2021 |
Impact | . |
URL | https://observablehq.com/@oliviafvane/branching-sparklines-line-graphs |
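The name-change example in the description above can be read as a chain of separate records linked by "renamed to" relations. The following is a minimal Python sketch of that idea, not the Press Picker code (which is an Observable notebook): the `RENAMED_TO` mapping and the `name_chain` helper are hypothetical names introduced here for illustration, and the titles are taken from the example in the description.

```python
from typing import Dict, List

# Hypothetical successor links built from the example in the description:
# each title maps to the title it was renamed to.
RENAMED_TO: Dict[str, str] = {
    "The Athletic Reporter": "The Reporter",
    "The Reporter": "The Midland Counties Reporter and General Advertiser",
    "The Midland Counties Reporter and General Advertiser": "The Reporter and General Advertiser",
}


def name_chain(title: str, renamed_to: Dict[str, str]) -> List[str]:
    """Follow rename links from a starting title, stopping at the end or on a cycle."""
    chain = [title]
    seen = {title}
    while chain[-1] in renamed_to:
        successor = renamed_to[chain[-1]]
        if successor in seen:  # guard against circular rename data
            break
        chain.append(successor)
        seen.add(successor)
    return chain


if __name__ == "__main__":
    for step in name_chain("The Athletic Reporter", RENAMED_TO):
        print(step)
```

Grouping records by such a chain is one way to treat separately catalogued titles as a single branching line in a visualisation.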
Title | DeezyMatch |
Description | DeezyMatch: A Flexible Deep Neural Network Approach to Fuzzy String Matching. DeezyMatch can be applied for performing the following tasks: record linkage; candidate selection for entity linking systems; toponym matching. |
Type Of Technology | Software |
Year Produced | 2020 |
Open Source License? | Yes |
URL | https://zenodo.org/record/3983554 |
Title | DeezyMatch |
Description | DeezyMatch: A Flexible Deep Neural Network Approach to Fuzzy String Matching. DeezyMatch can be applied for performing the following tasks: record linkage; candidate selection for entity linking systems; toponym matching. |
Type Of Technology | Software |
Year Produced | 2020 |
Open Source License? | Yes |
URL | https://zenodo.org/record/3983555 |
Title | DiachronicEmb-BigHistData |
Description | Pipeline to preprocess, train, and align diachronic word embeddings from Big Historical Data, and carry out semantic change tasks on them. |
Type Of Technology | Software |
Year Produced | 2022 |
Open Source License? | Yes |
Impact | . |
URL | https://github.com/Living-with-machines/DiachronicEmb-BigHistData |
Title | Living-with-machines/alto2txt |
Description | alto2txt: Extract plain text from newspapers. Converts XML (in METS 1.8 / ALTO 1.4, METS 1.3 / ALTO 1.4, BLN or UKP format) publications to plaintext articles and generates minimal metadata. Full documentation and demo instructions. Added: PyPI version and MIT license badges to README.md; pytest-cov with default options to assess documentation; isort to .pre-commit-config.yaml to sort import consistency; pycln to .pre-commit-config.yaml to check unused imports; pycln configuration to pyproject.toml; alto2txt as a command line script in pyproject.toml. Changed: switch from Apache v2.0 license to MIT license, in line with project recommendations; updated mypy in .pre-commit-config.yaml. Deprecated: replace extract_publications_text.py with the alto2txt command line interface script specified in pyproject.toml. Removed: setup.py; requirements.txt. Fixed: python = ">3.6.0" in pyproject.toml rather than >3.7 for consistency with documentation; licensing ambiguity (now all should be MIT); typos in README.md; superfluous imports via pycln in pre-commit. |
Type Of Technology | Software |
Year Produced | 2022 |
Open Source License? | Yes |
URL | https://zenodo.org/record/7378349 |
Title | Living-with-machines/alto2txt |
Description | alto2txt: Extract plain text from newspapers. Converts XML (in METS 1.8 / ALTO 1.4, METS 1.3 / ALTO 1.4, BLN or UKP format) publications to plaintext articles and generates minimal metadata. Full documentation and demo instructions. Added: PyPI version and MIT license badges to README.md; pytest-cov with default options to assess documentation; isort to .pre-commit-config.yaml to sort import consistency; pycln to .pre-commit-config.yaml to check unused imports; pycln configuration to pyproject.toml; alto2txt as a command line script in pyproject.toml. Changed: switch from Apache v2.0 license to MIT license, in line with project recommendations; updated mypy in .pre-commit-config.yaml. Deprecated: replace extract_publications_text.py with the alto2txt command line interface script specified in pyproject.toml. Removed: setup.py; requirements.txt. Fixed: python = ">3.6.0" in pyproject.toml rather than >3.7 for consistency with documentation; licensing ambiguity (now all should be MIT); typos in README.md; superfluous imports via pycln in pre-commit. |
Type Of Technology | Software |
Year Produced | 2022 |
Open Source License? | Yes |
Impact | Software supporting the digital research infrastructure for working with digitised records such as historic newspapers. Of use to GLAM sector institutions and researchers alike |
URL | https://zenodo.org/record/7378350 |
Title | Living-with-machines/hmd_newspaper_dl: Initial release |
Description | This release is for a version of the code which works with the current version of the British Library Research Repository. What's Changed: update code to support new BL repo by @davanstrien in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/4 ; Bump addressable from 2.7.0 to 2.8.0 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/5 ; Bump rexml from 3.2.4 to 3.2.5 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/7 ; Bump nokogiri from 1.11.0 to 1.12.5 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/6 . New Contributors: @dependabot made their first contribution in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/5 . Full Changelog: https://github.com/Living-with-machines/hmd_newspaper_dl/compare/v0.0.1...v0.0.2 |
Type Of Technology | Software |
Year Produced | 2021 |
URL | https://zenodo.org/record/5571790 |
Title | Living-with-machines/hmd_newspaper_dl: Initial release |
Description | This release is for a version of the code which works with the current version of the British Library Research Repository. What's Changed: update code to support new BL repo by @davanstrien in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/4 ; Bump addressable from 2.7.0 to 2.8.0 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/5 ; Bump rexml from 3.2.4 to 3.2.5 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/7 ; Bump nokogiri from 1.11.0 to 1.12.5 in /docs by @dependabot in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/6 . New Contributors: @dependabot made their first contribution in https://github.com/Living-with-machines/hmd_newspaper_dl/pull/5 . Full Changelog: https://github.com/Living-with-machines/hmd_newspaper_dl/compare/v0.0.1...v0.0.2 |
Type Of Technology | Software |
Year Produced | 2021 |
Impact | Code for bulk downloading newspaper datasets |
URL | https://zenodo.org/record/5571839 |
Title | Living-with-machines/image-search: workshop materials |
Description | What's Changed: workshop materials for hack and yack created by @davanstrien in https://github.com/Living-with-machines/image-search/pull/1 |
Type Of Technology | Software |
Year Produced | 2022 |
URL | https://zenodo.org/record/6473464 |
Title | Living-with-machines/nnanno: 0.0.2 |
Description | This release adds nnanno to PyPI |
Type Of Technology | Software |
Year Produced | 2021 |
Open Source License? | Yes |
URL | https://zenodo.org/record/5537184 |
Title | MapReader |
Description | MapReader is a free, open-source software library written in Python for analyzing large map collections (scanned or born-digital). This library transforms the way historians can use maps by turning extensive, homogeneous map sets into searchable primary sources. MapReader allows users with little or no computer vision expertise to i) retrieve maps via web-servers; ii) preprocess and divide them into patches; iii) annotate patches; iv) train, fine-tune, and evaluate deep neural network models; and v) create structured data about map content. |
Type Of Technology | Software |
Year Produced | 2021 |
Open Source License? | Yes |
Impact | Further applications in a new domain: the Turing project Scivision is applying MapReader in a plant phenotyping task. |
URL | https://github.com/Living-with-machines/MapReader |
Title | Notebook: Prepare Zooniverse Data for Analysis and Deposit |
Description | This Jupyter Notebook, written in Python, combines Zooniverse classification and subject files into a single CSV with redacted usernames and identifying information. It can be opened directly in Colab from the page. It is a November 2023 update to the original tutorial to support a release of Zooniverse data. Part of a collection of Jupyter Notebooks for processing Zooniverse classification and subject files created for the British Library's Digital Scholarship Training Programme by the Living with Machines project's British Library team. |
Type Of Technology | Software |
Year Produced | 2023 |
Open Source License? | Yes |
URL | https://zenodo.org/doi/10.5281/zenodo.10392953 |
Title | Observable notebook 'Heatmap for polygons' |
Description | JavaScript Observable code notebook demonstrating a geospatial visualisation technique: "Visualise overlaps in a large polygon dataset: colourise-alpha using WebGL shaders + PIXI.js". The code notebook demonstrates the technique on historical maps data from National Library of Scotland. |
Type Of Technology | Webtool/Application |
Year Produced | 2020 |
Open Source License? | Yes |
Impact | Thanks from National Library of Scotland, whose data it is demonstrated on, who described the code as "really interesting and useful" for them. |
URL | https://observablehq.com/@oliviafvane/heatmap-for-polygons |
Title | Press Picker: An interactive visualisation tool for newspaper metadata |
Description | Press Picker was created to help select British Library newspaper titles for digitisation. Read more about the context in this blog post and see an interactive demo in this post. The tool provides an overview of newspaper holdings over time, their different formats (hardcopy or microfilm), and the relationship between titles connected by name changes. Titles can be selected within the interface and their data exported. We are sharing the code for reuse. Press Picker consists of two Python Jupyter notebooks. |
Type Of Technology | Software |
Year Produced | 2021 |
Open Source License? | Yes |
Impact | Inquiry about reuse from Berlin State Library (Staatsbibliothek zu Berlin) |
URL | https://github.com/Living-with-machines/PressPicker_public |
Title | alan-turing-institute/lwm_ARTIDIGH_2020_OCR_impact_downstream_NLP_tasks: ARTIDIGH Zenodo |
Description | Small version bump with updated linguistic processing notebooks. |
Type Of Technology | Software |
Year Produced | 2020 |
URL | https://zenodo.org/record/3610375 |
Title | alan-turing-institute/lwm_ARTIDIGH_2020_OCR_impact_downstream_NLP_tasks: ARTIDIGH Zenodo |
Description | Small version bump with updated linguistic processing notebooks. |
Type Of Technology | Software |
Year Produced | 2020 |
URL | https://zenodo.org/record/3611200 |
Title | davanstrien/computer-vision-DHNoridic-2020-workshop 0.1 |
Description | An introduction to computer vision for working with maps: workshop at DHN 2020 |
Type Of Technology | Software |
Year Produced | 2020 |
URL | https://zenodo.org/record/4106323 |
Title | davanstrien/computer-vision-DHNoridic-2020-workshop 0.1 |
Description | An introduction to computer vision for working with maps: workshop at DHN 2020 |
Type Of Technology | Software |
Year Produced | 2020 |
URL | https://zenodo.org/record/4106322 |
Title | deduplify - author Sarah Gibson |
Description | deduplify is a Python command line tool that will search a directory tree for duplicated files and optionally remove them. It generates an MD5 hash for each file recursively under a target directory and identifies the filepaths that generate unique and duplicated hashes. When deleting duplicated files, it deletes those deepest in the directory tree first leaving the last present. |
Type Of Technology | Software |
Year Produced | 2022 |
Open Source License? | Yes |
Impact | The deduplify tool enables the deduplication of file records in messy datasets and has been used within the process of wrangling the JISC1 & JISC2 newspaper datasets into a form amenable to further processing. |
URL | https://github.com/Living-with-machines/deduplify |
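The approach described for deduplify (hash every file under a directory, group paths by hash, and when deleting keep one copy while removing the deepest paths first) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not deduplify's own code or command line interface: the function names `md5_of_file`, `group_by_hash` and `paths_to_delete` are hypothetical, and the sketch only reports which paths would be removed rather than deleting anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def md5_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_hash(root: Path) -> Dict[str, List[Path]]:
    """Map each MD5 digest to every file under root that produces it."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[md5_of_file(path)].append(path)
    return dict(groups)


def paths_to_delete(paths: List[Path]) -> List[Path]:
    """Order duplicates deepest first and spare the final (shallowest) one."""
    ordered = sorted(paths, key=lambda p: (len(p.parts), str(p)), reverse=True)
    return ordered[:-1]


if __name__ == "__main__":
    # Report duplicates under the current directory without deleting them.
    for digest, paths in group_by_hash(Path(".")).items():
        if len(paths) > 1:
            print(digest, [str(p) for p in paths_to_delete(paths)])
```

Hashes with a single path correspond to unique files; hashes with several paths are the duplicated groups. MD5 is used here only because the description names it.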
Title | defoe, the Spark-based toolbox for analysing historical datasets |
Description | This work presents defoe, a new scalable and portable digital eScience toolbox that enables historical research. It allows for running text mining queries across large datasets, such as historical newspapers and books, in parallel via Apache Spark. It handles queries against collections that comprise several XML schemas and physical representations. The proposed tool has been successfully evaluated using five different large-scale historical text datasets and two HPC environments, as well as on desktops. Results show that defoe allows researchers to query multiple datasets in parallel from a single command-line interface and in a consistent way, without any HPC environment-specific requirements. |
Type Of Technology | Software |
Year Produced | 2019 |
Impact | Originally developed by UCL and the British Library (funded by Jisc, 2015) then UCL (funded by 2016-2018), defoe was refactored and extended by EPCC, The University of Edinburgh, for: the Alan Turing Institute, funded by Scottish Enterprise as part of the Alan Turing Institute-Scottish Enterprise Data Engineering Program; the College of Arts Humanities and Social Sciences, The University of Edinburgh (2019-2020), as part of the Data Driven Innovation Programme (funded by the Edinburgh and South-East Scotland City Region Deal); and Living with Machines (2019-2020). |
URL | https://github.com/alan-turing-institute/defoe |
Title | defoe_visualization, a collection of notebooks for analysing further the results obtained by defoe |
Description | defoe_visualization is a repository of Jupyter notebooks which complements the defoe scalable and portable digital eScience toolbox for historical research. These notebooks allow researchers to explore query results from defoe and to post-process the results to reveal new insights into the historical data processed by defoe. The notebooks are complemented with sample data files with the query results produced by the authors. |
Type Of Technology | Software |
Year Produced | 2019 |
Impact | Developed by EPCC, The University of Edinburgh in conjunction with: the Alan Turing Institute (2018-2019), funded by Scottish Enterprise as part of the Alan Turing Institute-Scottish Enterprise Data Engineering Program; the College of Arts Humanities and Social Sciences, The University of Edinburgh (2019-2020), as part of the Data Driven Innovation Programme (funded by the Edinburgh and South-East Scotland City Region Deal); and Living with Machines (2019-2020). |
URL | https://github.com/alan-turing-institute/defoe_visualization |
Title | flyswot |
Description | flyswot is a command-line tool that supports British Library staff in processing 'legacy' digitised content using computer vision. It can be run across images in a directory to check for incorrect metadata. Flyswot has the following features: UNIX style search patterns for matching images to predict against; a CSV output containing the paths to the input images, the predicted label and the model's confidence for that prediction; a summary 'report' providing a high-level summary of the predictions made by flyswot; automatic download of the latest available flyswot model. |
Type Of Technology | Software |
Year Produced | 2021 |
Open Source License? | Yes |
Impact | The British Library holds a large amount of 'legacy' digitised material (~1 Petabyte). Some of these images have previously assigned uncorrected metadata as the result of limitations in a legacy digitised image platform. In particular, images of manuscript pages were given the label 'flysheet' when no other suitable label was available. As a result, many images are falsely labelled as 'flysheets'. As part of the move to a new digital library system, there is a desire to correct this metadata. The scale of this problem makes fully manual intervention challenging. Flyswot, and the associated machine learning models, were developed in collaboration with the Heritage Made Digital team within the library to support library staff in processing this material. Flyswot is actively being used in this workflow and is helping speed up the process of checking images and helping assess the required work in processing collections. Beyond this, flyswot has also identified collection items that didn't have pagination and as a result curators have intervened not only in digital collections but also with the physical items. |
URL | https://github.com/davanstrien/flyswot |
Title | jisc-wrangler - author Timothy Hobson |
Description | jisc-wrangler is a Python tool written specifically to restructure and deduplicate XML files containing OCR content from the JISC 1 & JISC 2 newspaper dataset. It outputs a canonical file structure and filename convention amenable to further processing with the alto2txt tool. |
Type Of Technology | Software |
Year Produced | 2022 |
Open Source License? | Yes |
Impact | This tool makes the JISC 1 & JISC 2 newspaper datasets accessible to the research project by cleaning, deduplicating and standardising the directory structure and filenames. It performs an essential pre-processing step that unlocks the potential of this open-access dataset. |
URL | https://github.com/Living-with-machines/jisc-wrangler |
Title | subsamplr - author Timothy Hobson |
Description | subsamplr is a Python tool for representative subsampling from a population. It was designed for sampling from a large collection of digital newspapers, but is a generic tool that could be applied in any context in which metadata is available for a population and a subsample is desired. Any features in the metadata can be used as dimensions for subsampling. The tool is configurable to connect to a metadata database and includes example Jupyter notebooks. |
Type Of Technology | Software |
Year Produced | 2022 |
Open Source License? | Yes |
Impact | Many avenues of research within the Living with Machines project target historic newspaper data, and given the volume of data available, the first step is typically to sample from the various newspaper collections to produce an accessible subset of data on which research methodologies can be developed and tested. The subsamplr tool is designed for precisely this purpose and is therefore an important component in the research workflow across a wide variety of investigations in the project. It enables researchers to specify subsampling parameters from which data samples are (reproducibly) generated that satisfy the requirements of the particular research question at hand. |
URL | https://github.com/Living-with-machines/subsamplr |
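One common way to draw a representative, reproducible subsample along chosen metadata dimensions is proportional stratified sampling with a fixed random seed. The sketch below illustrates that general technique in plain Python; it is an assumption about the kind of method involved, not subsamplr's actual algorithm, API or configuration. The function name `stratified_sample`, the record fields and the `fraction` and `seed` parameters are all hypothetical.

```python
import random
from collections import defaultdict
from typing import Any, Dict, List, Sequence, Tuple

Record = Dict[str, Any]


def stratified_sample(
    records: Sequence[Record],
    dimensions: Sequence[str],
    fraction: float,
    seed: int = 0,
) -> List[Record]:
    """Sample the same fraction from every stratum defined by the given metadata fields."""
    rng = random.Random(seed)  # fixed seed makes the sample reproducible
    strata: Dict[Tuple[Any, ...], List[Record]] = defaultdict(list)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        strata[key].append(record)

    sample: List[Record] = []
    for key in sorted(strata):  # deterministic stratum order
        members = strata[key]
        # Keep at least one item so small strata stay represented.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


if __name__ == "__main__":
    # Hypothetical metadata: 80 issues from the 1850s and 20 from the 1860s.
    issues = [{"id": i, "decade": 1850 if i < 80 else 1860} for i in range(100)]
    subset = stratified_sample(issues, ["decade"], fraction=0.1, seed=42)
    print(len(subset), "issues sampled")
```

Because each stratum contributes in proportion to its size, the subsample keeps roughly the same mix of values along the chosen dimensions as the full collection.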
Description | "How we collaborate" blog post series |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Media (as a channel to the public) |
Results and Impact | Blog post series reflecting on our experience of collaborating on the project. |
Year(s) Of Engagement Activity | 2019 |
URL | http://livingwithmachines.ac.uk/category/how-we-collaborate/ |
Description | "Introducing the Language Lab" blog post |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Media (as a channel to the public) |
Results and Impact | Blogpost introducing the language lab, which explored the social and cultural impact of the Industrial Revolution as reported in newspapers and other types of textual sources. |
Year(s) Of Engagement Activity | 2019 |
URL | http://livingwithmachines.ac.uk/introducing-the-language-lab/ |
Description | "Introducing..." blog post series |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Media (as a channel to the public) |
Results and Impact | We published a series of blog posts introducing each member of the Living with Machines team |
Year(s) Of Engagement Activity | 2019 |
URL | http://livingwithmachines.ac.uk/category/the-team/ |
Description | 'Data visualisation for cultural heritage collections' course at N8 Centre of Excellence in Computationally Intensive Research |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Third sector organisations |
Results and Impact | Olivia Vane delivered a two-part workshop on data visualisation for Digital Humanities. Split over two sessions, the workshops gave an overview of the key concepts in data visualisation, before moving to tackle more practical exercises in the second week. |
Year(s) Of Engagement Activity | 2021 |
URL | https://n8cir.org.uk/events/data-visualisation-hums/ |
Description | 'HISTORICAL RESEARCH IN THE DIGITAL AGE', PART 4: 'RESEARCHING WITH BIG DATA; AND HOW HISTORIANS CAN WORK COLLABORATIVELY' |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Ruth Ahnert, Professor of Literary History & Digital Humanities at QMUL, considers in this blogpost how historians can work with big data, with reference to the need for and approaches to interdisciplinary collaboration. Ruth draws on her experience of leading Living with Machines, an interdisciplinary project bringing together historians and data scientists, and based at the British Library and Alan Turing Institute. Ruth and fellow researchers describe the project - and the opportunities and challenges of interdisciplinary working - in their new book, Collaborative Historical Research in the Age of Big Data, published by Cambridge University Press and freely available Open Access. |
Year(s) Of Engagement Activity | 2023 |
URL | https://blog.royalhistsoc.org/2023/02/07/historical-research-in-the-digital-age-part-4/ |
Description | 124 Introduction to OCR and HTR |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Daniel van Strien presented as part of a British Library staff training workshop on OCR (Optical Character Recognition) |
Year(s) Of Engagement Activity | 2020 |
Description | A talk or presentation - D. Bevan "Building ethical frameworks to balance risk and innovation" invited panel at Research Panel |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Invited talk at panel as part of two day workshop |
Year(s) Of Engagement Activity | 2020 |
URL | https://ei4ai.wordpress.com/workshop/ |
Description | ACH talk: Bridging humanities: embedding public participation in a collaborative research project |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A talk for the Association for Computers and the Humanities ACH2021 conference in July 2021, presented by Mia and based on her work with Barbara McGillivray, Giorgia Tolfo, Emma Griffin and others in the project. |
Year(s) Of Engagement Activity | 2021 |
URL | https://livingwithmachines.ac.uk/bridging-humanities-embedding-public-participation-in-a-collaborati... |
Description | AI and ethics panel discussion for Leeds Digital Festival at Leeds City Museum |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Public/other audiences |
Results and Impact | AI is coming, so how do we live and work with it? What can we all do to develop ethical approaches to AI to help ensure a more equal and just society? Co-Investigators Maja Maricevic and Mia Ridge organised a public panel discussion to address questions around the ethics of AI, building on the issues explored in the Living with Machines exhibition. Hosted by Leeds City Museum and timed for inclusion in the Leeds Digital Festival, the event was held on September 29, 2022. The event blurb was: What can we all do to develop ethical approaches to AI to help ensure a more equal and just society? AI is all around us - it's in our phones, our social networks, job and credit applications and more. AI is increasingly used to make complex judgements, and process information in creative ways that previously seemed unique to humans. But what about areas that require empathy and emotional intelligence? Some uses of AI have very serious implications and require our full attention - especially when it comes to making decisions that can affect our values. AI advances will continue to disrupt our lives. How will we live and work with machines in the future? What should our relationship with AI look like? How do we ensure that AI systems work for all humans, and not just those implementing them? Join us from 5:30pm for an exciting and thought-provoking conversation with our expert panel on the ethics of AI. Hear from our panel of experts including: Chair - Timandra Harkness; Sherin Mathew - Founder & CEO of AI Tech UK; Robbie Stamp - CEO at Bioss International and author; Keely Crockett - Professor in Computational Intelligence, Manchester Metropolitan University; Andrew Dyson - Global Co-Chair of DLA Piper's Data Protection, Privacy and Security Group. You'll have a chance to ask questions in the Q&A, then mingle with other attendees over drinks. This panel and related workshop are organised by the British Library and Alan Turing Institute's Living with Machines project, in partnership with Ai Tech North UK. |
Year(s) Of Engagement Activity | 2022 |
URL | https://blogs.bl.uk/digital-scholarship/2022/09/learn-more-about-living-with-machines-at-our-events.... |
Description | AI and the creative industries panel discussion for Leeds Digital Festival at Leeds City Museum |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Public/other audiences |
Results and Impact | How will AI change what we wear, the TV and films we watch, what we read? Co-Investigators Maja Maricevic and Mia Ridge organised a public panel discussion to address questions such as those posed in a related blog post: 'some uses of AI have very serious implications and require our full attention - especially when it comes to designing and using AI in ways that integrate a human-centred set of values. How we entertain ourselves with the aid of technology and AI has been on our radars for a while. But what about cultural and linguistic diversity, equality and equity of opportunity, fairness, enrichment of our life experiences, emotional growth and better mental health, stronger connection and understanding of others, the opportunity to learn and acquire new knowledge, delight in beauty? Can we envisage AI being part of these things?' Hosted by Leeds City Museum and timed for inclusion in the Leeds Digital Festival, the event was held on September 22, 2022. Event blurb: Join us in person to hear from a panel of experts about the use of AI in fashion, beauty, broadcasting, arts, and heritage, and how this might impact our lives, and possibly our identity in the years to come. There's an exponential rise of AI in our everyday lives - from the use of our data by social media, to the algorithms working out what we buy, how we vote and what we eat. AI is also increasingly underpinning the cultural and creative sphere of our lives. This creates new and exciting opportunities, but also brings new challenges. Join us from 5:30pm for an exciting and thought-provoking conversation with our expert panel. Our amazing panellists include: Chair: Zillah Watson, independent consultant, ex-BBC; Rebecca O'Higgins - Founder KI-AH-NA; Laura Ellis, Head of Technology Forecasting, BBC; Maja Maricevic, Head of Higher Education and Science, British Library - libraries and heritage. You'll have a chance to ask questions in the Q&A, then mingle with other attendees over drinks. The panel is organised by the British Library and Alan Turing Institute's Living with Machines project, in partnership with Ai Tech North UK. |
Year(s) Of Engagement Activity | 2022 |
URL | https://livingwithmachines.ac.uk/the-role-of-ai-in-creative-and-cultural-industries/ |
Description | AI4LAM presentation: AI training resources for GLAM |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation introducing an "AI training resources for GLAM review" document. The presentation took place as part of an AI4LAM community call (https://sites.google.com/view/ai4lam) |
Year(s) Of Engagement Activity | 2021 |
URL | https://docs.google.com/document/d/1l4KFhAX1nijBUmE5Srfcq2ELFvrYbm8fp3jaszsmiAE/edit?usp=sharing |
Description | An introduction to computer vision for working with digitised heritage collections (workshop) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A workshop with around 50 participants introducing deep learning-based computer vision methods to digital humanities researchers and heritage professionals. |
Year(s) Of Engagement Activity | 2020 |
URL | https://github.com/Living-with-machines/computer-vision-DHNordic-2020-workshop |
Description | Andre Piza presented at "Future of Journalism" to Open Society Foundation Journalism Programme |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A presentation about the Living with Machines project started a dialogue with BBC News Labs and Open Society, leading to a talk from BBC News Labs Executive Product Manager (David Caswell) at the Alan Turing Institute and a visit from Open Society's Independent Journalism Senior Programme Specialist (Shuwei Fang). Opportunities for collaboration with LWM are now being explored with BBC News Labs. |
Year(s) Of Engagement Activity | 2019 |
Description | Annotation session with the British Library staff, 2 August 2019, organised by Daniel van Strien, Mariona Coll Ardanuy, and Mia Ridge |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | We had an open annotation session in which we invited British Library staff members to help with our experiments. We planned four different linguistic annotation tasks (named entity recognition, recognition of machines, entity linking to Wikipedia, and semantic role labeling) on newspaper articles from the nineteenth century. |
Year(s) Of Engagement Activity | 2019 |
URL | http://livingwithmachines.ac.uk/collecting-annotations-from-british-library-staff/ |
Description | Article on History First |
Form Of Engagement Activity | A magazine, newsletter or online publication |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | Co-I Mia Ridge was interviewed by journalist Mark Bridge for a piece that was posted in September 2022, 'Tools from £9.2m Industrial Revolution project will uncover hidden stories'. The story featured project work with computer vision and maps, crowdsourcing and 'rail space', and mentioned the project's GitHub repository, website and exhibition. It concluded with a focus on making our work relevant and accessible to community historians and the GLAM sector. |
Year(s) Of Engagement Activity | 2022 |
URL | https://historyfirst.com/tools-from-9-2m-industrial-revolution-project-will-uncover-hidden-stories/ |
Description | Association for Computers and the Humanities paper presentation: Bridging humanities: embedding public participation in a collaborative research project |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A presentation for the annual Association for Computers and the Humanities conference that explicitly addressed the challenges of embedding crowdsourcing as a form of public engagement into a 'data science' research project with different conceptions of timelines, metrics for success, etc. |
Year(s) Of Engagement Activity | 2021 |
URL | https://ach2021.ach.org/ |
Description | Beta Test of Library Carpentry Introduction to AI and Machine Learning |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Workshop hosted by LIBER/BNF. Daniel van Strien contributed towards a beta test of a lesson that is currently in the early stages of development and is to become a part of the Library Carpentry Curriculum. |
Year(s) Of Engagement Activity | 2021 |
URL | https://libereurope.eu/mec-events/beta-test-of-library-carpentry-introduction-to-ai-and-machine-lear... |
Description | Blog Post 'Heatmap for polygons: visualise overlaps in a large polygon dataset' |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Third sector organisations |
Results and Impact | A technical 'how to' blog post on the Living with Machines website, describing a geospatial visualisation technique. National Library of Scotland (whose data the post demonstrates the technique on) and Registers of Scotland both fed back that the blog post was helpful and interesting. |
Year(s) Of Engagement Activity | 2020 |
URL | https://livingwithmachines.ac.uk/heatmap-for-polygons-visualise-overlaps-in-a-large-polygon-dataset/ |
Description | Blog Post 'Press Picker code published' |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Third sector organisations |
Results and Impact | "We are very pleased to share the code for 'Press Picker', our interactive data visualisation tool for newspaper metadata: https://github.com/Living-with-machines/PressPicker_public." |
Year(s) Of Engagement Activity | 2021 |
URL | https://livingwithmachines.ac.uk/press-picker-code-published/ |
Description | Blog Post on Sources Lab (Understanding the Victorian Newspaper Landscape) |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | Blog post describing the work of the Sources Lab on digitising and processing the Newspaper Press Directories. |
Year(s) Of Engagement Activity | 2019 |
URL | http://livingwithmachines.ac.uk/sources-understanding-the-victorian-newspaper-landscape/ |
Description | Blog post 'Press Picker: visualising formats and title name changes in the British Library's newspaper holdings' |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Third sector organisations |
Results and Impact | Blog post on the Living with Machines website: 'Press Picker: visualising formats and title name changes in the British Library's newspaper holdings'. |
Year(s) Of Engagement Activity | 2020 |
URL | https://livingwithmachines.ac.uk/press-picker-visualising-formats-and-title-name-changes-in-the-brit... |
Description | Blog post: "Finding your way among newspapers" |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | Blog post "Finding your way among newspapers" on how to select newspapers for digitisation at the British Library. |
Year(s) Of Engagement Activity | 2020 |
URL | http://livingwithmachines.ac.uk/finding-your-way-among-newspapers/ |
Description | Blog post: 'Platforms for People-Powered Research' |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A blog post highlighting contributions to a conference and sharing a video from the panel discussion. |
Year(s) Of Engagement Activity | 2021 |
URL | https://livingwithmachines.ac.uk/platforms-for-people-powered-research/ |
Description | Blog post: Ad or not? New crowdsourcing task |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | A blog post describing a new crowdsourcing task that aimed to make data from a previous task easier to analyse by classifying articles as being advertisements or not. |
Year(s) Of Engagement Activity | 2021 |
URL | https://livingwithmachines.ac.uk/ad-or-not-new-crowdsourcing-task/ |
Description | Blog post: Bridging humanities: embedding public participation in a collaborative research project |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A blog post highlighting our contribution to a panel at the Association for Computers and the Humanities conference. |
Year(s) Of Engagement Activity | 2021 |
URL | https://livingwithmachines.ac.uk/bridging-humanities-embedding-public-participation-in-a-collaborati... |
Description | Blog post: Exploring ideas for our Living with Machines exhibition |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Blog post setting out exhibition themes and introducing our collaboration with Leeds Museums and Galleries. |
Year(s) Of Engagement Activity | 2021 |
URL | https://livingwithmachines.ac.uk/exploring-ideas-for-our-living-with-machines-exhibition/ |
Description | Blog post: First crowdsourced datasets available |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | A post in support of the first open data release from crowdsourcing activities on the project, linking to the British Library's research repository. |
Year(s) Of Engagement Activity | 2020 |
URL | https://livingwithmachines.ac.uk/first-crowdsourced-datasets-available/ |
Description | Blog post: From prams to Parliament - what was a machine? Help us find out |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | A blog post in support of the Comms launch for novel crowdsourcing tasks designed in collaboration with historians, computational linguists and others on the Living with Machines project. |
Year(s) Of Engagement Activity | 2020 |
URL | https://livingwithmachines.ac.uk/from-prams-to-parliament-what-was-a-machine-help-us-find-out/ |
Description | Blog post: Highlights from crowdsourcing projects at the British Library |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The post provided progress reports on a range of crowdsourcing projects at the British Library, including the Zooniverse tasks created through Living with Machines. |
Year(s) Of Engagement Activity | 2020 |
URL | https://blogs.bl.uk/digital-scholarship/2020/12/highlights-from-crowdsourcing-projects-at-the-britis... |
Description | Blog post: Learning from Zooniverse volunteers to improve crowdsourcing projects |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | A blog post that describes how feedback from volunteers led to improvements in our crowdsourcing task launched in December 2020. |
Year(s) Of Engagement Activity | 2021 |
URL | https://livingwithmachines.ac.uk/learning-from-zooniverse-volunteers-to-improve-crowdsourcing-projec... |
Description | Blog post: Sharing our Delivery Plan |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The post celebrated the deposition of our 2019 Delivery Plan in the British Library's repository. Sharing it was part of our commitment to transparency, and to sharing our lessons learnt as we ourselves learn them. |
Year(s) Of Engagement Activity | 2021 |
URL | https://livingwithmachines.ac.uk/sharing-our-delivery-plan/ |
Description | Blog post: The role of AI in creative and cultural industries |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A blog post by Co-Investigator Maja Maricevic in support of our events programme in Leeds in September 2022. The post provided background for the events and set out some of the questions our panellists were set to explore. |
Year(s) Of Engagement Activity | 2022 |
URL | https://livingwithmachines.ac.uk/the-role-of-ai-in-creative-and-cultural-industries/ |
Description | Blog post: What does a 'digital humanities research software engineer' do? |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A conversation between Mia and Olivia Vane designed to broaden the reach and demonstrate the range of experience, skills and job titles relevant to our job advertisement replacing Olivia as DH RSE. When we interviewed for the post, we learnt that this post was pivotal in the successful applicant deciding to apply for the role. |
Year(s) Of Engagement Activity | 2021 |
URL | https://livingwithmachines.ac.uk/what-does-a-digital-humanities-research-software-engineer-do/ |
Description | Blog post: What is a 'machine' anyway? Help us describe them |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | A blog post in support of the Comms launch for novel crowdsourcing tasks designed in collaboration with historians, computational linguists and others on the Living with Machines project. |
Year(s) Of Engagement Activity | 2020 |
URL | https://livingwithmachines.ac.uk/what-is-a-machine-anyway-help-us-find-out/ |
Description | British Library Open House Session at Boston Spa |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | The Library's Living with Machines team provides an update on this collaborative project, including the ways in which its work with data science and digitised collections benefits the Library. |
Year(s) Of Engagement Activity | 2020 |
Description | British Library Open House Session at King's Cross St. Pancras |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | The Library's Living with Machines team provides an update on this collaborative project, including the ways in which its work with data science and digitised collections benefits the Library. |
Year(s) Of Engagement Activity | 2020 |
Description | British Library Show and Tell Session at King's Cross St. Pancras |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Other audiences |
Results and Impact | An interactive poster session about the various tasks and outcomes of the Projects Labs, attended by staff across the British Library and Alan Turing Institute. |
Year(s) Of Engagement Activity | 2019 |
Description | British Library project web page |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Created a project web page on the British Library website to provide official visible information about the project in support of our other engagement activities. |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.bl.uk/projects/collective-wisdom |
Description | Cambridge GLAM Digital champions lightning talk "The Living with machines project" |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | I presented the Living with Machines project to an audience of librarians and other professionals from the GLAM (Galleries, Libraries, Archives, Museums) sector. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.eventbrite.co.uk/e/glam-digital-champions-digital-lunch-january-2020-tickets-89946158381... |
Description | Case Study OED API: Exploring word meaning in historical texts with computational methods |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Blog post for the OED case studies series |
Year(s) Of Engagement Activity | 2021 |
URL | https://public.oed.com/blog/case-study-oed-api/ |
Description | Catching up with maps |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Blog post on the Living with Machines website to provide a high-level update on the maps-related work in the project. |
Year(s) Of Engagement Activity | 2020 |
URL | https://livingwithmachines.ac.uk/catching-up-with-maps/ |
Description | Code and Coffee |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Blog post describing an internal project activity aimed at facilitating collaboration. |
Year(s) Of Engagement Activity | 2019 |
URL | http://livingwithmachines.ac.uk/code-and-coffee/ |
Description | Collecting annotations from British Library staff |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A blog post outlining an event held with British Library staff |
Year(s) Of Engagement Activity | 2019 |
URL | http://livingwithmachines.ac.uk/collecting-annotations-from-british-library-staff/ |
Description | Computational Approaches to Ordnance Survey Maps blog post |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This blog post introduces the preliminary work of the "Space & Time Lab" in Living with Machines, which experimented with computer vision methods for studying large sets of historical, digitized maps. With 179 page views, it generated several conversations with external researchers about our use of these methods in the humanities context. |
Year(s) Of Engagement Activity | 2019 |
URL | http://livingwithmachines.ac.uk/introducing-the-space-and-time-lab/ |
Description | Computer Vision for Digital Heritage SIG |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Blog post on the Living with Machines website announcing the new Computer Vision for Digital Heritage SIG. |
Year(s) Of Engagement Activity | 2020 |
URL | https://livingwithmachines.ac.uk/computer-vision-for-digital-heritage/ |
Description | Computer Vision for the Humanities workshop (Warwick University) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Postgraduate students |
Results and Impact | This workshop provides an introduction to computer vision for humanities applications. In particular, it gives a high-level overview of machine learning-based approaches to computer vision, focusing on supervised learning. The workshop includes discussion of working with historical data. The materials are based on in-progress Programming Historian lessons. |
Year(s) Of Engagement Activity | 2021 |
URL | https://zenodo.org/record/4746493 |
Description | Conference Roundtable: The Future of Spatial History for Spatial Humanities 2021/DHangouts |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Roundtable discussion on the future of spatial history with K. McDonough, J. Taylor, and L. Scholz, chaired by I. Gregory for the Spatial Humanities 2021 conference and presented as part of the DHangout series hosted by Lancaster University. Audience of about 35 people with conversation about the future of computational spatial historical research. |
Year(s) Of Engagement Activity | 2021 |
URL | https://youtu.be/60aT8J4hMAA |
Description | Convening the Applied Data Analysis strand at the Digital Humanities at Oxford Summer School (July 2023) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Postgraduate students |
Results and Impact | This strand teaches how to manipulate, analyse and explore data from the Humanities and the cultural sector. It is aimed at both GLAM professionals and academics, particularly those in the Arts & Humanities. It introduces both theoretical (descriptive statistics, modelling) and practical (Python data analysis stack) aspects of applied data analysis. |
Year(s) Of Engagement Activity | 2023 |
Description | Convening the Text to Tech strand at the Digital Humanities at Oxford Summer School (July 2023) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | This hands-on workshop offers an introduction to natural language processing in Python, from processing texts to extracting meaning from them, as well as the basics of automated semantic analysis with machine learning. It is aimed at both GLAM professionals and academics, particularly those in the Arts & Humanities. |
Year(s) Of Engagement Activity | 2023 |
URL | https://web.cvent.com/event/58fc430e-5294-4919-a7a3-c2b14f81a059/websitePage:4745b3f6-aba6-4f03-ada6... |
Description | Convening the Text2Tech Strand at the Digital Humanities at Oxford Summer School 2022 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | We organised the Text2Tech strand at the Digital Humanities Summer School at the University of Oxford. Our week-long course provided an introduction to text mining with Python and was attended by 40 students, mostly postgraduates or academic staff. |
Year(s) Of Engagement Activity | 2022 |
URL | https://eng.ox.ac.uk/events/dhoxss-2022/ |
Description | Course 107 'Data Visualisation for Cultural Heritage Collections': British Library Digital Scholarship Training Programme |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Third sector organisations |
Results and Impact | In May 2020, Olivia Vane taught the rebooted Course 107 'Data Visualisation for Cultural Heritage Collections' for the British Library Digital Scholarship Training Programme: internal training in digital methods for British Library staff. The course was delivered over 2 sessions (4.5hrs in total) and included presentations and exercises with British Library datasets. It was taught over Zoom + Slack. |
Year(s) Of Engagement Activity | 2020 |
Description | Crowdsourcing tasks 'What's that machine? Describe it!' and 'What's that machine? Classify it!' |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | Building on the lessons learnt from earlier experiments, in early December we launched two new crowdsourcing projects devised in collaboration with researchers including historians and computational linguists. These projects aimed to integrate linguistic research questions with tasks that encouraged volunteers to engage with social and technological history in the pages of 19th century newspapers. As part of the launch process we applied to become an official Zooniverse project, which included separate reviews by Zooniverse staff and volunteers. We tweaked the interfaces as a result, and were delighted to be recognised as an official Zooniverse project. Nearly 10,000 tasks were completed by over 700 registered volunteers (and countless anonymous volunteers) within a week. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.zooniverse.org/projects/bldigital/ |
Description | D3 JavaScript visualisation in a Python Jupyter notebook |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A blog post describing how to combine JavaScript, the visualisation library D3.js and Python Jupyter notebooks. Accompanying notebook code was published with this blogpost. |
Year(s) Of Engagement Activity | 2020 |
URL | https://livingwithmachines.ac.uk/d3-javascript-visualisation-in-a-python-jupyter-notebook/ |
Description | Daniel van Strien: Flyswot: garden-variety machine learning applications conference presentation |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Conference presentation "Flyswot: garden-variety machine learning applications" at the ai4lam conference. Presenters: Daniel van Strien, Digital Curator at the British Library, Andrew Longworth, Digitisation Project Analyst at the British Library, Catherine Cronin, The Heritage Made Digital Team at the British Library |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.bnf.fr/en/program-international-conference-les-futurs-fantastiques-december-8-10-2021 |
Description | Daniel van Strien AI4LAM webinar "In conversation with Jeremy Howard - upskilling to better navigate AI" |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Webinar organised by Daniel van Strien as part of the AI4LAM teaching and learning working group and the AI4LAM Au/ANZ chapter. The webinar hosted invited speaker Jeremy Howard from fastai to speak on the topic of making machine learning accessible to people working in libraries. |
Year(s) Of Engagement Activity | 2022 |
URL | https://www.eventbrite.com/e/in-conversation-with-jeremy-howard-upskilling-to-better-navigate-ai-tic... |
Description | Daniel van Strien The Carpentries: Introduction to AI for GLAM Workshop at AI4LAM conference |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Delivery of 'The Carpentries: Introduction to AI for GLAM' workshop online as part of the AI4LAM conference. |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.bnf.fr/en/agendaEN/workshops-tutorials-les-futurs-fantastiques-3rd-conference-about-arti... |
Description | Daniel van Strien, British Library Digital Scholarship Training Programme, workshop on computer vision for historical maps, 13 February 2020 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | A workshop held for British Library staff on using Computer Vision methods with heritage data including historic map collections. |
Year(s) Of Engagement Activity | 2020 |
Description | Daniel van Strien, Kaspar Beelen, CREATE Digital History Workshop: Maps-as-Data: Analysing Historical Maps with Computer Vision, Feb 2020 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Workshop on using Computer Vision methods with historical collections, held at the CREATE centre at the University of Amsterdam. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.create.humanities.uva.nl/events/digital-history-workshop-maps-as-data-analysing-historic... |
Description | Daniel van Strien, Katherine McDonough, Daniel Wilson presented at Victorian Data Conference, University of Virginia, November 15-16, 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Three Living with Machines members presented in a session about "Living with Bias" at the Victorian Data conference, the first gathering of nineteenth-century studies scholars using digital methods in their work. Attended by about 100 researchers, our presentation both introduced Living with Machines to this largely US-based audience and generated several connections which have already resulted in visits to the Turing/BL in London in 2020 (including the faculty director of the University of Virginia Scholar's Lab, Alison Booth, who was a co-host of this conference). |
Year(s) Of Engagement Activity | 2019 |
URL | http://data-caucus.herokuapp.com/conference-cfp |
Description | Data Study Group on smart monitoring for conservation areas |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Data Study Groups (DSGs) are intensive five-day 'collaborative hackathons' hosted at the Turing, which bring together organisations from industry, government, and the third sector with talented multi-disciplinary researchers from academia. Kasra Hosseini and Mariona Coll Ardanuy were the principal investigators of a DSG with the World Wide Fund for Nature (WWF) on "Smart monitoring for conservation areas". The methods explored are closely related to methods directly applicable to Living with Machines datasets. |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.turing.ac.uk/research/publications/data-study-group-final-report-wwf |
Description | David Beavan and James Hetherington contributing to Royal Society 'Dynamics of data science skills' |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Policymakers/politicians |
Results and Impact | Contribution to report - see link. |
Year(s) Of Engagement Activity | 2019 |
URL | https://royalsociety.org/topics-policy/projects/dynamics-of-data-science/ |
Description | David Beavan and Lydia France Living with Machines Distributed Conference panel 'AI Beyond STEM: digital skills to unleash the power of data science and AI for all' |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Online expert panel session with international guests, delivered to an international audience. |
Year(s) Of Engagement Activity | 2023 |
URL | https://livingwithmachines.ac.uk/event/ai-beyond-stem-digital-skills-to-unleash-the-power-of-data-sc... |
Description | David Beavan invited 'floating expert' and Mia Ridge, Dr. Katherine McDonough, Dr. Kaspar Beelen and Dr. Kasra Hosseini (project collaborator) invited participants at Computational Archival Science Workshop: Exploring Data, Investigating Methodologies, The National Archives, 20-21 June 2019 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | About 100 people attended this event, where Kaspar Beelen and Katie McDonough presented the keynote lecture on bias in digitized archival collections being used in the Living with Machines project. The international audience included GLAM professionals and students from the US, UK, and elsewhere in Europe, and the event fostered conversations about the role of GLAM institutions in collaborating with researchers to develop best practices for creating, preserving, and making accessible digitised and born-digital collections. |
Year(s) Of Engagement Activity | 2020 |
URL | https://blog.nationalarchives.gov.uk/computational-archival-science-cas-exploring-data-investigating... |
Description | David Beavan invited inaugural talk at the inaugural Humanities Festival at University of Georgia, US: 'Beyond Digital Humanities: Weaving Humanities Research Software Engineering and AI' |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | 50 staff and students attended talk, good engagement, have hosted UGA staff in Turing and are planning bilateral training for postgrads |
Year(s) Of Engagement Activity | 2023 |
URL | https://willson.uga.edu/public-humanities/uga-humanities-council/2023-uga-humanities-festival/ |
Description | David Beavan invited presentation at Software Development in Digital Humanities Labs and Projects, University of Sussex, 30 July 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Overview presentation on Living with Machines project |
Year(s) Of Engagement Activity | 2019 |
Description | David Beavan invited talk at National Library of Scotland: 'Focused tech development delivering enhanced collections data' |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | David Beavan invited talk given to National Library of Scotland (NLS) internal professional seminar series |
Year(s) Of Engagement Activity | 2020 |
Description | David Beavan led, Mia Ridge, Barbara McGillivray participated in panel discussion 'Data Science & Digital Humanities: new collaborations, new opportunities and new complexities' at Digital Humanities 2019 conference, Utrecht, July 11, 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This panel highlights the emerging collaborations and opportunities between the fields of Digital Humanities (DH), Data Science (DS) and Artificial Intelligence (AI). It charts the enthusiastic progress of the Alan Turing Institute, the UK national institute for data science and artificial intelligence, as it engages with cultural heritage institutions and academics from arts, humanities and social sciences disciplines. We discuss the exciting work and learnings from various new activities, across a number of high-profile institutions. As these initiatives push the intellectual and computational boundaries, the panel considers the gains, benefits, and complexities encountered. The panel latterly turns towards the future of such interdisciplinary working, considering how DS & DH collaborations can grow, with a view towards a manifesto. As Data Science grows globally, this panel session will stimulate new discussion and direction, to help ensure the fields grow together and arts & humanities remain a strong focus of DS & AI, and so that DH methods and practices continue to benefit from new developments in DS, which will enable future research avenues and questions. |
Year(s) Of Engagement Activity | 2019 |
URL | https://dev.clariah.nl/files/dh2019/boa/0364.html |
Description | David Beavan presented at Turing Innovation Symposium, hosted by Accenture, Dublin, 3-4 April 2019. |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Overview of Living with Machines for Turing Innovation Showcase in Dublin 2019. |
Year(s) Of Engagement Activity | 2019 |
Description | David Beavan presented talk 'Potential Uses of a Registry of Digitised Works: By scholars' at Global Digitised Dataset Network, British Library, 10 June 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Third sector organisations |
Results and Impact | Lessons from the project on uses of a registry of digitised works |
Year(s) Of Engagement Activity | 2019 |
URL | https://gddnetwork.arts.gla.ac.uk/ |
Description | Deep Learning approaches in GIScience session at the Royal Geographical Society Annual Conference: Maps and Machines presentation |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation about computer vision for maps research at the annual Royal Geographical Society conference. Virtual audience of about 30 people. |
Year(s) Of Engagement Activity | 2021 |
URL | https://sdesabbata.github.io/deep-learning-giscience/ |
Description | Deep learning reading group |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Blog post introducing an internal reading group on deep-learning methods being used by the project. |
Year(s) Of Engagement Activity | 2019 |
URL | http://livingwithmachines.ac.uk/deep-learning-reading-group/ |
Description | Developing Data Study Group with TNA on (web) archives and social attitudes towards new technologies, initiated by Barbara McGillivray and David Beavan |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Data Study Groups are intensive five-day 'collaborative hackathons' hosted at the Turing, which bring together organisations from industry, government, and the third sector with talented multi-disciplinary researchers from academia. Beavan and McGillivray co-organised a DSG with the National Archives on "Discovering topics and trends in the UK Government Web Archive". |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.turing.ac.uk/events/data-study-group-december-2019 |
Description | Diachronic and diatopic word embeddings from British historical newspapers (AIUCD conference) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Poster presentation on ongoing experiments with diachronic and diatopic word embeddings trained on historical British newspaper collections (1830-1889). |
Year(s) Of Engagement Activity | 2023 |
URL | https://doi.org/10.5281/zenodo.7892460 |
Description | Did Machines Drive History? |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Blog post introducing the first minimum research outcome of the language lab, in which we explored to what extent machines were being seen as agents able to drive change. |
Year(s) Of Engagement Activity | 2019 |
URL | http://livingwithmachines.ac.uk/did-machines-drive-history/ |
Description | Digital Humanities and Research Software Engineering working together: some examples of a fruitful collaboration from the Living with Machines project |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Federico Nanni and Kasra Hosseini (from the Research Engineering group at the Alan Turing Institute) and Kaspar Beelen and Mariona Coll Ardanuy (postdocs in the Living with Machines project) shared their experience of working together on projects at the intersection of software engineering, computational linguistics and digital humanities, as part of the KQ Codes Technical Socials at University College London. About 20 participants attended. |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.ucl.ac.uk/research-it-services/programming-hub/kq-codes-technical-socials |
Description | Digital Humanities at Oxford Summer School Virtual Event 2020 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Kaspar Beelen, Federico Nanni, and Mariona Coll Ardanuy gave the talk "From Text to Tech: Text mining and the humanities, using language models to find living machines in nineteenth-century books" at the 2020 virtual edition of Digital Humanities at Oxford Summer School, with 270 attendees. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.dhoxss.net/dhox2020-virtual-event-report |
Description | Digital Humanities at Oxford Summer School Virtual Event 2020 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Interactive workshop on "An introduction to natural language processing with Python", organised by Mariona Coll Ardanuy, Kaspar Beelen, and Federico Nanni. Participants learned how to use Python programming for powerful text processing in the Humanities, from cleaning texts to extracting meaning from them, as well as the basics of automated semantic analysis with machine learning. There were 60 attendees. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.dhoxss.net/dhox2020-virtual-event-report |
Description | Digital Humanities at Oxford Summer School Virtual Event 2021 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Interactive workshop on "Language models and their use in the digital humanities", by Mariona Coll Ardanuy, Kaspar Beelen, and Federico Nanni. This workshop offered a basic introduction to language models using Python. Participants learned how to use and interpret different language models and to train their own models. There were 16 participants. |
Year(s) Of Engagement Activity | 2021 |
URL | https://digital.humanities.ox.ac.uk/digital-humanities-oxford-summer-school |
Description | Digital Humanities at Oxford Summer School Virtual Event 2021 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Kaspar Beelen, Federico Nanni, and Mariona Coll Ardanuy gave the talk "Models of Language: Using algorithms to explore the past" at the 2021 virtual edition of Digital Humanities at Oxford Summer School. There were 450 participants. |
Year(s) Of Engagement Activity | 2021 |
URL | https://digital.humanities.ox.ac.uk/digital-humanities-oxford-summer-school |
Description | Echoing Through Time: New Tunes for Old Words |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | A blog post about the process of recording ballads from the British Library's collections for use in the exhibition, and their subsequent release on Soundcloud. |
Year(s) Of Engagement Activity | 2022 |
URL | https://livingwithmachines.ac.uk/echoing-through-time-new-tunes-for-old-words/ |
Description | Emma Griffin invited presentation: International symposium - 'Dartmouth and the World', Dartmouth University, 10-20 October 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | Gave talk on "Life and Living Standards in Britain's Industrial Revolution" |
Year(s) Of Engagement Activity | 2019 |
Description | Emma Griffin invited presentation: Oregon State University, Centre for the Humanities, 7 October 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Talk on "Home Economics: Food, Money, and Emotions in Victorian Britain" |
Year(s) Of Engagement Activity | 2019 |
Description | Engagement focused website, blog or social media channel - Blog post: Turing Researcher Spotlight - David Beavan |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Turing Researcher Spotlight - David Beavan. Senior Research Software Engineer David Beavan is using AI to unlock new insights into the Industrial Revolution. |
Year(s) Of Engagement Activity | 2022 |
URL | https://www.turing.ac.uk/people/spotlights/david-beavan |
Description | Finding words in maps, part 2: seeing the results |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Blog post about evaluating the 'Strabo' tool (software for transcribing text in digitised historical maps) on our map data through visualisation. |
Year(s) Of Engagement Activity | 2019 |
URL | https://livingwithmachines.ac.uk/finding-words-in-maps-part-2-seeing-the-results/ |
Description | Free Thinking: Archiving, curating and digging for data |
Form Of Engagement Activity | A broadcast e.g. TV/radio/film/podcast (other than news/press) |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Public/other audiences |
Results and Impact | BBC Radio 3 broadcast: What stories are being uncovered by people working behind the scenes at museums and institutions? Lisa Mullen finds out talking to Tessa Jackson - Conservator; David Beavan - Senior Research Software Engineer, Turing Institute and Matt Harle - Archivist and curator at the Barbican. Barbara Hepworth: Art & Life runs at the Hepworth Wakefield from 21 May 2021 to 27 Feb 2022. The gallery also runs a Hepworth Research Network in partnership with the Department of History of Art at the University of York and the School of Art, Design and Architecture at the University of Huddersfield. https://hepworthwakefield.org/our-story/hepworth-research-network/people/ Matthew Harle is an archivist working with the Barbican as it prepares for its 40th anniversary so is assembling an archive alongside the Guildhall School of Music and Drama https://www.barbican.org.uk/our-story/our-archive/about-the-archive https://matthewharle.com/Barbican-Archive The Alan Turing Institute https://www.turing.ac.uk/ is the national institute for data science and artificial intelligence running a host of research projects into topics including AI, Public Policy and Living with Machines - a project that rethinks the impact of technology on the lives of ordinary people during the Industrial Revolution. https://livingwithmachines.ac.uk You can hear more from historian Emma Griffin in this conversation about Understanding the Industrial Revolution https://www.bbc.co.uk/programmes/p081y7h4 |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.bbc.co.uk/programmes/m000vydf |
Description | G. Solomon and J. Rhodes 'Work, Occupational Change, and Technological Adoption: Britain, 1851-1911', European Social Science History Conference (ESSHC) 2023, Gothenburg, Sweden, 13/04/23 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation of work on occupational change using digitised census data to ESSHC 2023 in Gothenburg, Sweden. Gained feedback from discussant and audience members, and broadened engagement with our methodological approaches (nominal linkage and street geo-coding) and research findings (human capital formation in the bicycle industry). Generated significant discussion, and resulted in future plans for participation in a 'best practice' workshop. |
Year(s) Of Engagement Activity | 2023 |
URL | https://esshc.socialhistory.org/conference/programme?day=95&time=328&session=5399&textsearch=solomon... |
Description | Giorgia Tolfo & Timothy Hobson online poster presentation at Data for History (June 2021): Modelling Time, Places, Agents (Berlin) entitled "Supporting an interdisciplinary research agenda through meta-modelling. The case of LwM" |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A poster was presented for discussion with academic conference attendees on the subject of the "meta-modelling" approach taken to conceptual data modelling within the LwM project. |
Year(s) Of Engagement Activity | 2021 |
Description | Hacking 23 years of government history: An example from The UK Government Web Archive |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Turing blog: Web archives provide a key resource for the public. They allow us to access a wide range of data reflecting all areas of a society but, as they are large and meticulously maintained datasets, they can be daunting and difficult to navigate. The Alan Turing Institute and The National Archives co-organised a Data Study Group challenge. Data Study Groups (DSGs) are events hosted by the Turing, which bring together some of the top talent from data science, artificial intelligence, and wider fields from across the world, to analyse real-world data science challenges. The culmination of that work is now available to read via the published Data Study Group report 'Discovering topics and trends in the UK government web archive' |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.turing.ac.uk/blog/hacking-23-years-government-history-example-uk-government-web-archive |
Description | Historical Hypothesis Generation (BlogPost) |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | Long blog post outlining an element of our interdisciplinary method. |
Year(s) Of Engagement Activity | 2020 |
URL | https://livingwithmachines.ac.uk/historical-hypothesis-generation-hypgen/ |
Description | Hunting for Treasure: Living with Machines and the British Library Newspaper Collection. |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation "Hunting for Treasure: Living with Machines and the British Library Newspaper Collection." at the Impresso Workshop in Lausanne (held online) |
Year(s) Of Engagement Activity | 2020 |
URL | https://impresso.github.io/eldorado/online-program/ |
Description | IIIF conference lightning talk: IIIF and machine learning inference: a love story? |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Lightning talk as part of the IIIF conference discussing the use of IIIF and computer vision to work with a Library of Congress collection of digitised newspapers. |
Year(s) Of Engagement Activity | 2021 |
URL | https://iiif.io/event/2021/annual_conference/ |
Description | Implications of AI for Libraries presentation |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Postgraduate students |
Results and Impact | A presentation, given as part of a postgraduate library science talk, on the implications of AI, drawing examples from the Living with Machines project. |
Year(s) Of Engagement Activity | 2020 |
Description | Information+ Conference talk: Olivia Vane, Kasra Hosseini, Katherine McDonough and Daniel CS Wilson - 'Maps in Time: Visualising the historical Ordnance Survey' |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A division is often made between maps and timelines. This presentation from the Living with Machines project explores combining the two, visualising a dataset of 130,000 maps from the early Ordnance Survey (OS), Britain's national mapping agency. It was the OS who, from the early 19th century, created the first comprehensive, detailed and accurate picture of Great Britain. We show how animated data graphics can bring the story of the maps to life for a popular audience. We also visualise the data by space and time to support analysis in research. |
Year(s) Of Engagement Activity | 2021 |
URL | https://vimeo.com/598429189 |
Description | Intro to D3 session for Alan Turing Institute REG |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | Teaching an 'Introduction to D3.js' for the Alan Turing Institute Research Engineering Group lunchtime tech talks. 2hr session: presentation and going through tutorials. |
Year(s) Of Engagement Activity | 2020 |
Description | Introduction to Computer Vision for Digital Heritage using Living with Machines research |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation as a part of the day-long conference organized by Polly Hudson to review Colouring London and related research for applications with Historic England and adjacent agencies. |
Year(s) Of Engagement Activity | 2021 |
URL | https://colouringlondon.org/ |
Description | Introduction to Jupyter Notebooks: the weird and the wonderful |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | An online workshop focused on potential uses of Jupyter Notebooks in GLAM (Galleries, Libraries, Archives and Museums) settings. |
Year(s) Of Engagement Activity | 2021 |
URL | https://github.com/Living-with-machines/Jupyter-Notebooks-The-Weird-and-Wonderful |
Description | Introduction to Python, with Mariona Coll Ardanuy, July 19th 2019, organised by Mariona Coll Ardanuy for Turing Community |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | 4-hour introductory course to programming for the Humanities, with a focus on text processing and data wrangling (e.g. opening and working with documents and file paths). The feedback was very positive. Participants got acquainted with the basics of Python programming, which they have been able to apply to the project on multiple occasions. |
Year(s) Of Engagement Activity | 2019 |
Description | Invited lecture, Luxembourg Centre for Contemporary and Digital History (C2DH) Hands-on History lecture series |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | My talk, 'Crowdsourcing in Living with Machines: crowdsourcing for engagement meets data science research', sparked a rich discussion afterwards |
Year(s) Of Engagement Activity | 2022 |
URL | https://www.c2dh.uni.lu/events/crowdsourcing-living-machines-crowdsourcing-engagement-meets-data-sci... |
Description | Invited talk 'Living with Machines', University of Aarhus |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Talk about the Living with Machines project for DH colleagues at Aarhus |
Year(s) Of Engagement Activity | 2022 |
Description | Invited talk on Computer Vision research in LwM for the Association of Geographic Information-Scotland. |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Industry/Business |
Results and Impact | About 150 people attended a talk about Living with Machines research with historical maps. |
Year(s) Of Engagement Activity | 2021 |
Description | Invited talk, Princeton University, 'Crowdsourcing and the Humanities' |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Invited talk and panel discussion for an event with the Center for Research Data and Digital Scholarship at University of Pennsylvania Libraries, The Center for Digital Humanities at Princeton University Library, the Princeton Geniza Lab, and the Zooniverse, attended by c. 40 people. The panel and event sparked extended discussion on social media. |
Year(s) Of Engagement Activity | 2021 |
URL | https://genizalab.princeton.edu/crowdsourcing-and-the-humanities |
Description | Invited talk: Crowdsourcing in cultural heritage lecture for Institut für Kunstgeschichte, LMU München |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | An invited talk for a German seminar group. |
Year(s) Of Engagement Activity | 2021 |
Description | Invited talk: User Experience (UX) for Citizen Science, iDigBio event |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | I was invited to speak at the event 'Biodiversity Digitization: Celebrating a decade of progress' in the session 'Innovations: Strategy & Coordination'. My talk outlined the importance of user experience design (UX) for increasing diverse participation in citizen science projects. |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.idigbio.org/wiki/index.php/Biodiversity_Digitization:_Celebrating_a_decade_of_progress |
Description | J. Rhodes and G. Solomon 'New perspectives on occupational change: Britain, 1851-1911', North American Conference on British Studies (NACBS) 2022, Chicago, 11/11/2022 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation of work on occupational change using digitised census data to NACBS 2022 in Chicago. Aim was to get feedback on our paper from discussant and audience members and to promote our new approaches (geocoding and nominal linkage). Audience questions and discussant's comments provided important feedback on how to shape the paper for future publication. The session raised awareness of LwM's work on census data within the historical discipline. |
Year(s) Of Engagement Activity | 2022 |
Description | J. Rhodes, J. Lawrence, D. Wilson, K. Beelen, K. McDonough, "Beyond the Tracks" presentation at DH2022 (online/Tokyo) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | About 25 participants joined the panel where we presented a paper "Beyond the tracks" for the Digital Humanities 2022 conference. |
Year(s) Of Engagement Activity | 2022 |
URL | https://dh2022.dhii.asia/dh2022bookofabsts.pdf |
Description | Jon Lawrence, Inter-Disciplinary Research Programme Assessor for British Academy - 'The Humanities and Social Sciences Tackling the UK's International Challenges' (2019) |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Assessing projects under the heading "The Humanities and Social Sciences Tackling the UK's International Challenges" |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.thebritishacademy.ac.uk/programmes/tackling-uk-international-challenges |
Description | K. Beelen, K. McDonough, "Maps and Machines: using computer vision to analyze the geography of industrial change (1790-1920)", University of Aberdeen DH Seminar, 26 Oct 2021 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation of maps research to DH community at the University of Aberdeen. |
Year(s) Of Engagement Activity | 2021 |
Description | K. Beelen, K. McDonough, DCS Wilson, J. Lawrence, K. Westerling, "The 'Environmental Scan' at work: radical contextualisation of newspaper collections for new historical research," DH2023 Long Paper, 10-14 July. |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | About 30 people attended this presentation at the 2023 DH conference and it has sparked conversation about the Environmental Scan method for non-UK collections as well as discussion about the accessibility of historical newspaper collections. |
Year(s) Of Engagement Activity | 2023 |
Description | K. Beelen, M. Coll Ardanuy and F. Nanni: "Breaking (the?) news in the nineteenth century", Knowledge, Information and Data Science (KIDS) group, University College London (UCL), London |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | We presented the results of a series of collaborations at the intersection of digital history, computational linguistics and software engineering focused on the use of our large digital collection of 19th Century newspapers. |
Year(s) Of Engagement Activity | 2022 |
Description | K. Beelen, M. Coll Ardanuy and F. Nanni: "Living with Machines: Analysing Digital Heritage at Scale", Digital Humanities Lab Exeter, University of Exeter |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | We presented the results of a series of collaborations at the intersection of digital history, computational linguistics and software engineering focused on the use of our large digital collection of 19th Century newspapers. |
Year(s) Of Engagement Activity | 2022 |
URL | https://www.exeter.ac.uk/news/events/details/index.php?event=11894 |
Description | K. McDonough "Maps as Data," OBTIC Séminaire, Paris, France, 3 June 2022 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Talk to Paris-based research group working on text analysis with historical documents, attended by about 30 people. |
Year(s) Of Engagement Activity | 2022 |
URL | https://obtic.sorbonne-universite.fr/actualite/je-analyse-spatiale-des-textes-litteraires/ |
Description | K. McDonough, "DH Careers: Beyond the Professoriate," CESTA, Stanford University, 15 Feb. |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | 30 PhD students at Stanford University attended this workshop to discuss career opportunities in the digital humanities. |
Year(s) Of Engagement Activity | 2022 |
URL | https://cesta.stanford.edu/events/dh-careers-beyond-professoriate |
Description | K. McDonough, "Maps as Data for Open Historical Research," Roundtable on AI and the Historical Profession: Applications and Implications, American Historical Association Annual Meeting, San Francisco, CA 4-7 Jan. |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Talk about the opportunities for using maps and AI methods in historical research at the American Historical Association. Attended by around 50 people. |
Year(s) Of Engagement Activity | 2024 |
URL | https://aha.confex.com/aha/2024/meetingapp.cgi/Session/25011 |
Description | K. McDonough, "Maps as [Open] [Humanities] Data: From Access to Analysis," Reimagining Industry/Academic/Cultural Heritage Partnerships in AI Workshop, AEOLIAN Network (Artificial Intelligence for Cultural Organisations), [virtual] 25 Oct. |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation to international audience about the maps research in Living with Machines, in particular the issues around ethical use of heritage resources in digital research. |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.aeolian-network.net/events/workshop-2/ |
Description | K. McDonough, D. Wilson, K. Beelen, G. Solomon, "Historians Among the Machines: From Reproducible Computational Experiments to Persuasive Historical Arguments" session, American Historical Association Annual Meeting, San Francisco, CA 4-7 Jan. |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Presentations from 4 former Living with Machines PDRAs at the American Historical Association on a panel dedicated to the project. Chaired by Lauren Tilton (University of Richmond). |
Year(s) Of Engagement Activity | 2024 |
URL | https://aha.confex.com/aha/2024/meetingapp.cgi/Session/25012 |
Description | K. McDonough, K. Hosseini, "Maps as Data" for Turing Catch Up Monthly Meeting, Jan 24 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Lightning talk about Maps research within Living with Machines during the monthly Turing Catch Up. Resulted in several inquiries about new applications of, and further research with, MapReader. |
Year(s) Of Engagement Activity | 2022 |
Description | Kaspar Beelen "Surveying the Newspaper Landscape" (CREATE Salon, February) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation at the University of Amsterdam attended by ca. 20 people. It was part of the "Salon" series organized by CREATE Amsterdam (Julia Noordegraaf). |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.create.humanities.uva.nl/ |
Description | Kaspar Beelen Presentation for the British Library News Collection Group |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation on the digitization of the Newspaper Press Directories and how this feeds into understanding the shape and contours of digital newspaper collections. |
Year(s) Of Engagement Activity | 2020 |
Description | Kaspar Beelen and Katherine McDonough, keynote presentation at the Computational Archival Science symposium, "Surveying the Land" |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Keynote presentation by Kaspar Beelen and Katherine McDonough at the Computational Archival Science Symposium, organized at the Alan Turing Institute (January 2020). |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.turing.ac.uk/events/computational-archival-science-cas-symposium |
Description | Kaspar Beelen, Invited talk "Stereotypes in Newspaper data" at the Dutch National Library Research Week, September 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Third sector organisations |
Results and Impact | Presentation at the Dutch Royal Library (KB) to report on the progress of my Research in Residence programme. It was part of the KB "Research Week" and was the most popular in terms of people signing up. |
Year(s) Of Engagement Activity | 2019 |
Description | Kaspar Beelen, Panel discussion on Coding Literacy in the Digital Humanities, at Digital Humanities Benelux, September 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Participation in a round table on the topic of "Coding Literacy in the Humanities" (organized by Marijn Koolen, Liliana Melgar and Mari Wigham). The round table included a presentation with different experts (Joris van Zundert, Elli Bleeker, Sally Chambers) and discussion with an audience of Digital Humanities experts. |
Year(s) Of Engagement Activity | 2019 |
URL | http://2019.dhbenelux.org/wp-content/uploads/sites/13/2020/01/DH_Benelux_2019_paper_25.pdf |
Description | Kaspar Beelen, Presentation on "Bias in the British Newspaper Archive" at Digital Humanities Benelux, September 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | A 15-minute paper presentation on the work that emerged out of the Sources Lab, focussed on understanding the newspaper landscape. Attended by ca. 25 people from various backgrounds (DH researchers, librarians). |
Year(s) Of Engagement Activity | 2019 |
URL | http://2019.dhbenelux.org/wp-content/uploads/sites/13/2019/08/DH_Benelux_2019_paper_33.pdf |
Description | Kaspar Beelen, Presentation on "The Agency of Machines" at Digital Humanities Benelux, September 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | Presentation reporting on "The Agency of Machines" at the poster session of Digital Humanities Benelux, 2019. It involved discussion with many interested conference attendees. |
Year(s) Of Engagement Activity | 2019 |
URL | http://2019.dhbenelux.org/program/ |
Description | Kaspar Beelen, Seminar on History and Text, Antwerp University, November 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Undergraduate students |
Results and Impact | Presentation on the use of Text Mining for History. Part of the course "History and Language" (BA2) organised by Marnix Beyen (University of Antwerp). |
Year(s) Of Engagement Activity | 2019 |
Description | Katherine McDonough and Jon Lawrence, "An introduction to Living with Machines," University of Exeter DH Seminar, 23 October 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Other audiences |
Results and Impact | Presentation to about 40 people at the DH Seminar at Exeter was a great opportunity to make contact with the expert community there and introduce them to our ongoing work. |
Year(s) Of Engagement Activity | 2019 |
URL | http://www.exeter.ac.uk/news/events/details/index.php?event=9637 |
Description | Katherine McDonough organized meeting with US experts in historical map processing using computer vision (29/8/2019 and 1/11/2019) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Conversation to plan for future collaboration with researchers working at the cutting edge of computer vision for historical maps in the United States. |
Year(s) Of Engagement Activity | 2019 |
Description | Katherine McDonough, "Living with Machines," invited presentation at Spatial Relationships in Text as Data, The Alan Turing Institute, October 28, 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Invited talk to review applications of research on qualitative spatial relations in the Living with Machines project. Question session offered an opportunity to learn about related research in the UK and to share our ongoing work with leaders in the field. |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.eventbrite.co.uk/e/spatial-relationships-in-text-as-data-tickets-76259685773 |
Description | Katherine McDonough, "Living with Machines," presentation at DH Seminar, Center for Spatial and Textual Analysis, Stanford University, December 2 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | About 60 people attended a presentation at Stanford University about Living with Machines. This conversation has created substantive links to the DH community at Stanford and there is continued interest in collaborating with us in the future. |
Year(s) Of Engagement Activity | 2019 |
URL | https://cesta.stanford.edu/events/cesta-seminar-dr-katie-mcdonough |
Description | Katherine McDonough, Fantastic Futures, invited presentation and workshop on computer vision for historical maps, 4-5 December 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | Presented Living with Machines research on computer vision with maps during a roundtable on applications of AI in GLAM institutions, generating conversation with an international audience about working with visual heritage materials at scale. The workshop offered GLAM staff, researchers, and policy leaders an opportunity for hands-on experience in computer vision, which has translated into invitations for collaboration and additional teaching opportunities. |
Year(s) Of Engagement Activity | 2019 |
URL | https://fantasticfutures.stanford.edu/ |
Description | Katie McDonough, Olivia Vane, and Daniel Van Strien gave a '21st Century Talk' for British Library staff: 'Maps and Machines: Using Computer Vision to Analyze the Geography of Industrialization (1780-1920)', 14 Jan 2020 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | Delivered a talk about using computer vision techniques to analyse digitised historical maps at scale. |
Year(s) Of Engagement Activity | 2020 |
Description | LWM listed at Genealogy Stories "10 Websites for the History of Ordinary People" |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Public/other audiences |
Results and Impact | An article on Medium listing good websites for the public to find out more about the history of ordinary people included Living with Machines as a recommended source. |
Year(s) Of Engagement Activity | 2021 |
URL | https://genealogystoriesuk.medium.com/10-websites-for-the-history-of-ordinary-people-9ecc8b1b4832 |
Description | Lab Talk for Workshop on Visualization for the Digital Humanities |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | Olivia Vane gave a lightning talk at the online 5th Workshop on Visualization for the Digital Humanities about the British Library Digital Scholarship department. |
Year(s) Of Engagement Activity | 2020 |
URL | http://vis4dh.org/ |
Description | Learn more about Living with Machines at events this winter |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | A blog post to promote exhibition-related events to be held at Leeds City Museum and online. |
Year(s) Of Engagement Activity | 2022 |
URL | https://blogs.bl.uk/digital-scholarship/2022/10/learn-more-about-living-with-machines-at-events-this... |
Description | Learn more about what AI means for us at Living with Machines events this autumn |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A blog post in support of events co-organised by two Co-Is and held at Leeds City Museum as part of the Leeds Digital Festival. |
Year(s) Of Engagement Activity | 2022 |
URL | https://blogs.bl.uk/digital-scholarship/2022/09/learn-more-about-living-with-machines-at-our-events.... |
Description | Library Carpentry session 1 (workshop) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Workshop 1 of a series of Library Carpentry workshops (https://librarycarpentry.org/) delivered online to British Library Staff |
Year(s) Of Engagement Activity | 2020 |
Description | Library Carpentry session 2 (workshop) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Workshop 2 of a series of Library Carpentry workshops (https://librarycarpentry.org/) delivered online to British Library Staff |
Year(s) Of Engagement Activity | 2020 |
Description | Library Carpentry session 3 (workshop) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Workshop 3 of a series of Library Carpentry workshops (https://librarycarpentry.org/) delivered online to British Library Staff |
Year(s) Of Engagement Activity | 2020 |
Description | Library Carpentry session 4 (workshop) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Workshop 4 of a series of Library Carpentry workshops (https://librarycarpentry.org/) delivered online to British Library Staff |
Year(s) Of Engagement Activity | 2020 |
Description | Linking Geo-Data through Test and Play |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Tutorial on DeezyMatch, troubleshooting session, and final roundtable to discuss the tools useful in linking geospatial data from historical sources. |
Year(s) Of Engagement Activity | 2020 |
URL | https://github.com/LinkedPasts/LaNC-workshop |
Description | Living with Machines Documentary episode 2: The Digitisation Process |
Form Of Engagement Activity | A broadcast e.g. TV/radio/film/podcast (other than news/press) |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This series of short videos in documentary form seeks to make visible the collaborative underpinnings of the project by highlighting the team's experiences, research objectives, challenges, and lessons learnt. Living with Machines was funded by UK Research and Innovation (UKRI) via the Strategic Priorities Fund and was administered by the Arts and Humanities Research Council (AHRC). This episode focuses on the digitisation process. Find out more here: https://bit.ly/49sUXjs |
Year(s) Of Engagement Activity | 2024 |
URL | https://www.youtube.com/watch?v=aGF343ketqw&list=PLuD_SqLtxSdWMYcu5YQDGqP9AGejg_cBb&index=2&t=4s |
Description | Living with Machines Documentary Episode 3: Computational methods, infrastructure, and skills |
Form Of Engagement Activity | A broadcast e.g. TV/radio/film/podcast (other than news/press) |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This series of short videos in documentary form seeks to make visible the collaborative underpinnings of the project by highlighting the team's experiences, research objectives, challenges, and lessons learnt. Living with Machines was funded by UK Research and Innovation (UKRI) via the Strategic Priorities Fund and was administered by the Arts and Humanities Research Council (AHRC). This episode focuses on computational methods, infrastructure, and skills. Find out more here: https://bit.ly/49sUXjs |
Year(s) Of Engagement Activity | 2024 |
URL | https://www.youtube.com/watch?v=cmW10eK-ojs&list=PLuD_SqLtxSdWMYcu5YQDGqP9AGejg_cBb&index=3&t=5s |
Description | Living with Machines Documentary episode 1: On Collaboration |
Form Of Engagement Activity | A broadcast e.g. TV/radio/film/podcast (other than news/press) |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This series of short videos in documentary form seeks to make visible the collaborative underpinnings of the project by highlighting the team's experiences, research objectives, challenges, and lessons learnt. Living with Machines was funded by UK Research and Innovation (UKRI) via the Strategic Priorities Fund and was administered by the Arts and Humanities Research Council (AHRC). Find out more here: https://bit.ly/49sUXjs |
Year(s) Of Engagement Activity | 2023 |
URL | https://www.youtube.com/watch?v=A__ZJgw4_00&list=PLuD_SqLtxSdWMYcu5YQDGqP9AGejg_cBb&index=1&t=48s |
Description | Living with Machines Documentary episode 4: The Environmental Scan |
Form Of Engagement Activity | A broadcast e.g. TV/radio/film/podcast (other than news/press) |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This series of short videos in documentary form seeks to make visible the collaborative underpinnings of the project by highlighting the team's experiences, research objectives, challenges, and lessons learnt. This episode focuses on the method of the 'environmental scan', which quantifies how the digitisation policies of the past (i.e., what gets into digitised corpora) can bias the outcomes of analyses we run. |
Year(s) Of Engagement Activity | 2024 |
URL | https://www.youtube.com/watch?v=vTc4S3Zx9IA&list=PLuD_SqLtxSdWMYcu5YQDGqP9AGejg_cBb&index=4 |
Description | Living with Machines OCR hack |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A blog post outlining an internal 'hack' event focused on OCR. |
Year(s) Of Engagement Activity | 2019 |
URL | http://livingwithmachines.ac.uk/living-with-machines-ocr-hack/ |
Description | Living with Machines book launch 'Collaborative Historical Research in the Age of Big Data: Lessons from an interdisciplinary project' |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The event was an online roundtable discussion, led by hosts Professor Jane Winters and Professor James Smithies, with the authors Ruth Ahnert, Emma Griffin, Mia Ridge and Giorgia Tolfo. It celebrated and promoted the newly published book 'Collaborative Historical Research in the Age of Big Data: Lessons from an interdisciplinary project' (available open access from Cambridge University Press as part of the Elements Series). It was part of AI UK 2023, The Alan Turing Institute's national showcase of data science, machine learning and artificial intelligence research and innovation. In a series of events held between 6 and 31 March 2023, AI UK Fringe brought together leaders in academia from across the UK's AI ecosystem to demonstrate, exhibit and update on their ground-breaking work. The event drew 196 registrants and 70 online attendees, and a recording of the session was publicised on The Alan Turing Institute's YouTube channel. |
Year(s) Of Engagement Activity | 2023 |
URL | https://livingwithmachines.ac.uk/event/book-launch-collaborative-historical-research-in-the-age-of-b... |
Description | MapReader Launch |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The MapReader Launch showcased librarians, historians, and data scientists discussing how Living with Machines has used this newly developed software library to experiment with National Library of Scotland Ordnance Survey maps. Launch participants had a chance to test MapReader with Ordnance Survey maps from the NLS and the British Library. Chris Fleet (NLS) and Nicole Coleman (Stanford) presented keynotes. On the afternoon of June 8, we heard from colleagues working on other open source, interdisciplinary projects that also explore historical map collections as primary sources. These projects are now featured as resources in the new Tool Gallery of The Alan Turing Institute's Computer Vision for Digital Heritage Special Interest Group. The MapReader Launch brought together historians and others with an interest in using digitized map collections as primary sources for computational research. Collectively, we learned about and discussed ways to encourage more open research in this space through skill development and shared digital resources and infrastructure. The impact of the Launch has been impressive: it enabled library curators to teach their own communities how to use MapReader with digitized map collections, motivated PhD and postdoctoral research with maps, and set in motion future working collaborations with organisations such as the Office for National Statistics, the University of Antwerp, Stanford Libraries, the National Archives (UK), and the French National Library. This exceptionally well-received event sets the stage for future MapReader research and development at an international level. |
Year(s) Of Engagement Activity | 2023 |
URL | https://livingwithmachines.ac.uk/event/mapreader-launch/ |
Description | MapReader Workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | In this workshop, we jumpstarted efforts to make MapReader more accessible and easier to use by bringing together people who already had some exposure to it for intensive, shared work. We focused on 3 main activities: testing MapReader on existing data to identify bugs and opportunities for simplifying or improving the code or library design; developing approaches for evaluating and analyzing MapReader outputs that are meaningful to humanities and some social science research; and documenting needs for tutorials and software documentation in order to make MapReader more accessible to specific user groups (e.g. historians, curators, geographers). Impacts have included ongoing engagement with the software library as a research and teaching tool by invited participants, integration of comments into the MapReader roadmap, and significant improvements to MapReader functionality thanks to bug reporting. |
Year(s) Of Engagement Activity | 2023 |
Description | MapReader day at ANHIMO 2023 Sorbonne Summer School in Paris |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Talks and hands-on workshop introducing MapReader to computer science and digital humanities postgraduate students at the Summer School of Numerical Analysis of the History of the Sea and Oceans hosted by the Sorbonne University Alliance Ocean Institute and SCAI (Sorbonne Center for Artificial Intelligence) in Paris on 27 June 2023. Led by Katie McDonough, Andy Smith, and Daniel Wilson. Impacts include extending re-use of MapReader among historians in France. |
Year(s) Of Engagement Activity | 2023 |
URL | https://scai.sorbonne-universite.fr/public/events/view/a8046651d11c55bfbcd0/11 |
Description | Maps as Data: A Humanistic Approach to Computer Vision for Large Map Collections |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation for the Unlocking Historical Maps of Southeast Asia Webinar Series, organized by Jane Jacobs at Yale-NUS in Singapore. The virtual workshop session was attended by 55 students, scholars, and librarians who are developing projects that use computational methods to study digitised map collections. |
Year(s) Of Engagement Activity | 2020 |
URL | https://historicmapssea.commons.yale-nus.edu.sg/unlocking/ |
Description | Mariona Coll-Ardanuy, Presentation at CogSci seminar at QMUL (13/06/2019) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Other audiences |
Results and Impact | Talk at the Cognitive Science group at Queen Mary University of London, presenting preliminary research on the language lab work for Living with Machines. There were very relevant comments, and interesting questions as well. A subsequent talk at the Cognitive Science seminar was planned, which will take place on 25 May 2020. |
Year(s) Of Engagement Activity | 2019 |
Description | Mentions and promotion for Living with Machines Book Launch |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | The Living with Machines book launch was promoted in the following outlets: https://dhandlib.org/2023/02/23/resource-collaborative-historical-research-in-the-age-of-big-data/ https://ai-uk.turing.ac.uk/fringe-events/ https://royalhistsoc.org/calendar/collaborative-historical-research-in-the-age-of-big-data-lessons-from-an-interdisciplinary-project/ https://twitter.com/LivingwMachines/status/1630994408173608960 |
Year(s) Of Engagement Activity | 2023 |
Description | Mia Ridge and Andre Piza, invited participants at AI and Storytelling workshop, King's Digital Lab, King's College London, 1 April 2019. |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Researchers and industry reflected on ways of collaborating in the field, with particular attention to the challenges around engaging Research Software Engineers, the skills needed, and project frameworks. The workshop consolidated the relationship between KDL and Living with Machines, leading to a second meeting at the Turing with the KDL Director and 2 of their researchers with a view to future collaboration. |
Year(s) Of Engagement Activity | 2019 |
Description | Mia Ridge and Olivia Vane presented at KQ Codes Technical Socials at University College London: 'Research software engineering at one of the world's largest libraries', 20 February 2020 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | The Knowledge Quarter (KQ) Codes Technical Socials at UCL are informal events for anyone with an interest in the computational methods and technology behind research and innovation. They are an opportunity to get to know fellow practitioners, and to discuss and learn about useful tools and techniques which may help with your work. We gave a presentation on research software engineering at the British Library, including a discussion of RSE roles on Living with Machines. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.ucl.ac.uk/research-it-services/programming-hub/kq-codes-technical-socials |
Description | Mia Ridge initiated a meetup for scholars and institutions working with digitised newspapers for humanities research at DH2019, Utrecht |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | Group established to discuss the challenges and opportunities for scholars and institutions to collaborate using digitised newspaper collections |
Year(s) Of Engagement Activity | 2019 |
Description | Mia Ridge led panel discussion 'The Past, Present and Future of Digital Scholarship with Newspaper Collections' at Digital Humanities 2019 conference, Utrecht, July 10, 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Overview presentation on Living with Machines project |
Year(s) Of Engagement Activity | 2019 |
URL | http://www.openobjects.org.uk/2019/07/the-past-present-and-future-of-digital-scholarship-with-newspa... |
Description | Mia Ridge presented 'Living with "Living with Machines": navigating the digital shift at scale' paper accepted for DCDC, Discovering Collections, Discovering Communities, organised by The National Archives and Research Libraries UK (November 2019) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Talk to cultural heritage audience at TNA about the project. |
Year(s) Of Engagement Activity | 2019 |
Description | Mia Ridge presented on the project Living with Machines to Alberta Comer, Dean and University Librarian, University of Utah (May 2019) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | Please add - compulsory |
Year(s) Of Engagement Activity | 2019 |
Description | Mia Ridge presented on the project at the Library of Congress's Digital Strategy Roundtable, Washington DC (June 2019) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Third sector organisations |
Results and Impact | Please add - compulsory |
Year(s) Of Engagement Activity | 2019 |
Description | Mia Ridge, invited presentation, 'Machine Learning and Digital Humanities' panel, University of Newcastle, Newcastle, September 5, 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | As machine learning becomes more common across a wide range of digital solutions, and increasingly factors in our daily lives, it is also being used more frequently in humanities research projects. The possibilities of machine learning need to be understood by humanities researchers and the complexities of the problems investigated in the humanities by those working with machine learning technologies. The humanities can offer a wealth of historical data that presents new challenges to machine learning methodologies: historical records, pictorial representations, literary (or other) text. Recent Digital Humanities projects already employ some machine learning technology, such as with the development of Handwritten Text Recognition (HTR), but the diversification of the data investigated with machine learning approaches has the potential to lead the technology in new and unexpected ways with real-world applications. Panel members include: • Beatrice Alex (University of Edinburgh), • Noura Al-Moubayed (Durham University), • Mia Ridge (British Library), • Melissa Terras (University of Edinburgh). |
Year(s) Of Engagement Activity | 2019 |
URL | https://n8cir.org.uk/events/machine-learning-and-digital-humanities/ |
Description | Mia Ridge, invited presentation, British Library Data Projects workshop, London, August 19, 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Overview presentation on Living with Machines project |
Year(s) Of Engagement Activity | 2019 |
Description | Mia Ridge, invited presentation, Consortium of European Research Libraries (CERL) Annual Seminar 2019, Göttingen, October 9, 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Policymakers/politicians |
Results and Impact | Overview presentation on Living with Machines project |
Year(s) Of Engagement Activity | 2019 |
Description | Mia Ridge, invited presentation, KCL / British Library Research Collaboration workshop, King's College London, September 27, 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Overview presentation on Living with Machines project |
Year(s) Of Engagement Activity | 2019 |
Description | Mia Ridge, invited presentation, Museums + AI Network workshop, Pratt Institute, New York, September 16, 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Overview presentation on Living with Machines project |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.openobjects.org.uk/2019/09/museums-ai-new-york-workshop-notes/ |
Description | Mia Ridge, invited presentation, Princeton University Library, Princeton, September 13, 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Overview presentation on Living with Machines project |
Year(s) Of Engagement Activity | 2019 |
Description | Mia Ridge, invited presentation, Research Libraries UK International Symposium on Digital Scholarship, London, October 14, 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This symposium explored the nature and extent of digital scholarship occurring within research libraries across the international research library community. It brought together representatives from international research library associations, funders, the academic community, and global-library collectives to discuss areas of potential cross-sector and interdisciplinary collaboration, and the routes and networks through which this might be achieved. Mia Ridge presented on "Building capacity for digital scholarship at a research library: Living with Machines, and the impact of data science" |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.rluk.ac.uk/digital-scholarship-and-the-role-of-the-research-library-symposium-slides/ |
Description | Mia Ridge, invited presentation, Wellcome Library, London, July 4, 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Overview presentation on Living with Machines project |
Year(s) Of Engagement Activity | 2019 |
Description | Mia Ridge, presentation, Library of Congress Machine Learning Summit, Washington DC, September 20, 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Ridge touched on three main kinds of challenges: scale, operational and interdisciplinary, and copyright. A larger scale requires new workflows and quickly grows expensive, operationalizing raises the question of producing public-facing infrastructure, and copyright involves negotiating complex rights issues. |
Year(s) Of Engagement Activity | 2019 |
URL | https://labs.loc.gov/static/labs/meta/ML-Event-Summary-Final-2020-02-13.pdf?loclr=blogsig |
Description | Netherlands Film Festival 2020: Generous Interfaces panel |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Olivia Vane took part in a panel at the Netherlands Film Festival 2020 (run online because of the Covid pandemic) on Generous Interfaces: "In the Generous Interfaces panel we investigate alternative ways to search audiovisual collections, using De Open Beelden Browser ('The Open Images Browser'). How can you enjoy exploring archives even if you're not looking for anything in particular?". Olivia gave a presentation and then participated in a panel discussion. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.filmfestival.nl/en/collection/nff-conferentie-generous-interfaces/ |
Description | New exhibition considers the human impact of rapid technological change in the 19th century |
Form Of Engagement Activity | A press release, press conference or response to a media enquiry/interview |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | Press release for the exhibition 'Living with Machines' at the Leeds City Museum, published at turing.ac.uk on the occasion of the opening. |
Year(s) Of Engagement Activity | 2022 |
URL | https://www.turing.ac.uk/news/new-exhibition-considers-human-impact-rapid-technological-change-19th-... |
Description | Newspapers in 'Living with Machines' |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | An invited talk on the British Library's Newspaper collection and the Living with Machines project for Congrès Médias 19 - Numapresse : Presses anciennes et modernes à l'ère du numérique, La BnF, 3 June 2022 |
Year(s) Of Engagement Activity | 2022 |
URL | https://figshare.com/articles/presentation/British_Library_Newspapers_and_Living_with_Machines/19963... |
Description | Olivia Vane, Katherine McDonough, Daniel van Strien, 21st Century Curator Talk (British Library staff talks), Maps and Machines: Using Computer Vision to Analyse the Geography of Industrialization (1780-1920), January 13, 2020 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Talk on Using Computer Vision to Analyse the Geography of Industrialization (1780-1920) |
Year(s) Of Engagement Activity | 2019 |
Description | Panel discussion: Expanding and Enriching Metadata through Engagement with Communities |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This panel discusses how cultural institutions are engaging various communities to co-create academic research and/or object metadata in order to increase representation and access to collections; highlighting how this is done in different ways to engage specific audiences and goals, i.e. graduate student assistantships, museum interactive experiences, crowdsourcing, and professional action groups. |
Year(s) Of Engagement Activity | 2021 |
URL | https://mcn2021virtual.sched.com/event/lwrc/expanding-and-enriching-metadata-through-engagement-with... |
Description | Paper submission: Hunting for Treasure: Living with Machines and the British Library Newspaper Collection |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Abstract: This chapter discusses the open access digitisation programme undertaken by Living with Machines, exploring the range of constraints that inform digitisation strategies and selection priorities. Because the landscape of digitised newspaper collections is so complex, and research and digitisation processes operate on different timelines, we have focused on opportunities to make digitisation choices both transparent and pragmatic. Working towards solutions that reflect collaborations between library staff and scholars, we introduce: a) Press Picker, our custom visualisation tool designed to support decision making about digitisation; and b) the Environmental Scan, a process of automatic metadata generation from the Newspaper Press Directories, a contemporaneous record of British newspapers. |
Year(s) Of Engagement Activity | 2020 |
Description | Participation in the "Computational Approaches for Digitized Historical Newspapers" Dagstuhl Seminar |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | About 20 international researchers were invited to participate in the "Computational Approaches for Digitized Historical Newspapers (22292)" Dagstuhl Seminar, including two members of the Living with Machines project: Kaspar Beelen and Mariona Coll Ardanuy. Dagstuhl research seminars focus on the exchange and development of ideas on current topics in computer science. In this particular edition, the focus was on analysing the successes and limitations of current computational approaches to historical newspapers, and on discussing future challenges, potential solutions and common strategies. The outcomes of the discussions and findings were published in a report. |
Year(s) Of Engagement Activity | 2022 |
URL | https://www.dagstuhl.de/en/seminars/seminar-calendar/seminar-details/22292 |
Description | Plenary talk at Conference on interdisciplinary and transdisciplinary research for sustainable development (UCLouvain, Belgium) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | The talk sparked many questions, both during and after the event. Several people were interested in knowing more about the activity and noted how it provided them with a completely fresh perspective on how issues in humanities can be investigated. I also received proposals for further engagement by multiple people, including the organisers of the event. |
Year(s) Of Engagement Activity | 2022 |
URL | https://uclouvain.be/en/discover/university-transition/conference-sur-la-recherche-interdisciplinair... |
Description | Podcast interview: Crowdsourcing with Dr Mia Ridge, MadeTech Making Tech Better podcast |
Form Of Engagement Activity | A broadcast e.g. TV/radio/film/podcast (other than news/press) |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | What is crowdsourcing, and how is it used to improve the British Library's online cultural heritage collections? Clare Sudbery talks to crowdsourcing expert Dr Mia Ridge about the power of volunteer digital engagement. |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.madetech.com/resources/podcasts/episode-14-mia-ridge-2/ |
Description | Poster submission: Data for History in Berlin |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Poster submission for the conference Data for History in Berlin (May 2020) approved. Due to Covid-19, the conference was postponed until May 2021. |
Year(s) Of Engagement Activity | 2020 |
URL | https://d4h2020.sciencesconf.org/ |
Description | Presentation 'Historic Census Data and Living with Machines' to Free UK Genealogy's 2021 conference on Open, Global Genealogy (22nd May 2021) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | Presentation on 'Historic Census Data and Living with Machines' delivered by Josh Rhodes and Guy Solomon to Free UK Genealogy's 2021 conference on Open, Global Genealogy (22nd May 2021). The presentation gave genealogical professionals, family historians, and other members of the public an insight into how Living with Machines is using historic census data. In particular, we focused on our use of open census data, which is in line with Free UK Genealogy's mission to provide free, online access to historic British census data. The presentation was delivered to an audience of c. 100 on Zoom, and has since received > 200 views on YouTube. Presenting at this conference enabled the Living with Machines project to establish a closer relationship with Free UK Genealogy, and to begin conversations about sharing data. The presentation also engaged members of the public, who expressed interest in our use of census data, and changed people's minds about what was possible to achieve at scale with historic census data. |
Year(s) Of Engagement Activity | 2021 |
URL | https://youtu.be/EY7mwn_sHHU?t=716 |
Description | Presentation at CogSci seminar at QMUL |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Postgraduate students |
Results and Impact | Talk at the Cognitive Science group at Queen Mary University of London, presenting research on "Animate Machines: A study on atypical animacy detection". |
Year(s) Of Engagement Activity | 2020 |
URL | http://imc.eecs.qmul.ac.uk/wiki/index.php/Abstract_Mariona_Coll_Ardanuy_25_March_2020 |
Description | Presentation at the ESPRit Online Seminar |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Kaspar gave a presentation for the European Society for Periodical Research (ESPRit) Online Seminar on 20 January 2023. The theme of the seminar was: "New Computational Approaches to Periodical Studies". The title of the presentation was: "Mining Victorian Metadata. A computational analysis of historical press directories" |
Year(s) Of Engagement Activity | 2023 |
URL | https://www.espr-it.eu/news/events/167-esprit-seminar-20-january-2023 |
Description | Presentation at the KBR Digital Heritage Seminar |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Presented "Assessing Biases in Digitized Newspaper Collections" at the Digital Heritage Seminar organized by the Royal Library of Belgium, (May 25, 2023) |
Year(s) Of Engagement Activity | 2023 |
Description | Presentation by Daniel Wilson and Ruth Ahnert at Text Mining Parliamentary Data Seminar, University of Umea, Sweden. |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | "Tracing the language of machines across genres: books, journals and newspapers", an academic presentation by Daniel Wilson and Ruth Ahnert to a high-profile international seminar featuring luminaries of the field such as Mark Algee-Hewitt, with Prof. Jo Guldi as chair and respondent. The presentation generated much interest in our method. |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.umu.se/en/events/comparing-parliaments-novels-and-newspapers_10768814/ |
Description | Presentation for the C2DH group in Luxembourg |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Presentation for the C2DH group in Luxembourg. The presentation was part of the "Hands-on History" lectures. |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.c2dh.uni.lu/events/living-machines-digital-perspectives-industrial-revolution |
Description | Presentation for the History Department at the University of Antwerp |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Presentation on digital methods for history for the History Department at the University of Antwerp. |
Year(s) Of Engagement Activity | 2021 |
Description | Presentation for the Parliamentary Data Seminar (14/10/2021) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Presentation on the Targeted Sense Disambiguation during the Parliamentary Data Seminar on the topic "What's really going on". |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.umu.se/en/events/text-mining-parliamentary-data-seminar-what-is-really-going-on-_1084277... |