Lost Visions: retrieving the visual element of printed books from the nineteenth century

Lead Research Organisation: Cardiff University
Department Name: Sch of English Communication and Philos

Abstract

Ours is an age when images proliferate at unprecedented speed thanks to digital technology. So it is ironic that our past visual culture is seriously threatened by the very same technical advances. Many verbal texts are now stored and delivered by machine, and digital search techniques help us understand the meaning, provenance and reception of these texts. However, for reasons of economy or of complex IPR issues, the illustrations in these texts are frequently omitted. When included, they are often of low quality and without the metadata which are needed for understanding them. While their market value hardly suffers because of these shortcomings, their use as sources of aesthetic, literary and historical information is seriously compromised. There is a risk that tomorrow's readers will be almost unaware of illustration, even though research has shown that illustrated texts have qualities, meanings and strategies which are very different from those of texts that are not illustrated, and even strikingly different from those of the same verbal texts stripped of their images. Whole genres of artistic, educational and informative products may effectively be lost to us.

A unique opportunity to start addressing this issue is provided by a dataset held by the British Library. This dataset consists of 68,000 digitised volumes (approx. 25 million pages) covering 'Literature', 'Philosophy', 'General History' and 'Geography' from the Long Nineteenth Century, a period that is arguably the most important in British book illustration. In these years, rapid changes in reproductive techniques were paralleled by changes in the meanings of art and its reception. Art was democratised and book illustrations became more widely collectable and mobile than ever before.

The pages in the dataset are scanned at high resolution and OCR'd and the presence of non-verbal objects found in them is noted. We will use computational methods to identify the visual characteristics of these non-verbal objects (i.e. whether they partake of the nature of maps, diagrams, tables, graphs, significant typographic layout etc.). We aim to add to the available metadata by giving full bibliographical details of the book, the exact location and size of the image, and, where possible, a caption or title, and an artist. In addition, we will develop tools for identifying the re-use of images (in the form of reproductions, re-drawings, recuttings and transmediations). The result will be a web-based interface that provides searchable access to a database of images with accompanying metadata. As such, this project will rescue large numbers of images which would otherwise to all intents and purposes be lost. While the size of the grant does not allow for a full analysis of the subject matter of these illustrations, we shall trial crowd-sourcing methods for assigning tags. We are currently in discussion with a data management SME on how collaboration could increase the effectiveness and decrease the unit cost of semantic tagging on a large scale.
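The metadata fields named above (bibliographical details, the location and size of the image, an optional caption and artist, and crowd-sourced tags) can be pictured as one record per image. The following is a minimal sketch only: the class name, field names, sample values and the search helper are all hypothetical illustrations, not the project's actual database schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class IllustrationRecord:
    """Hypothetical metadata record for one image found on a scanned page."""
    book_title: str
    publication_year: int
    page_number: int
    # Assumed convention: bounding box in pixels as (left, top, width, height).
    bbox: Tuple[int, int, int, int]
    caption: Optional[str] = None   # not every image has a caption
    artist: Optional[str] = None    # recorded only where it can be identified
    tags: Set[str] = field(default_factory=set)  # e.g. crowd-sourced tags

    def area(self) -> int:
        """Size of the image on the page, in square pixels."""
        return self.bbox[2] * self.bbox[3]


def search(records: List[IllustrationRecord], term: str) -> List[IllustrationRecord]:
    """Case-insensitive match of a term against captions and tags."""
    term = term.lower()
    return [
        r for r in records
        if term in (r.caption or "").lower()
        or term in {t.lower() for t in r.tags}
    ]


# Invented sample data, for illustration only.
records = [
    IllustrationRecord("Example Book A", 1850, 12, (100, 200, 300, 400),
                       caption="View of a harbour", tags={"ship"}),
    IllustrationRecord("Example Book B", 1875, 3, (0, 0, 50, 60)),
]
harbour_hits = search(records, "harbour")
```

Keeping caption and artist optional reflects the "where possible" qualification in the text: a record remains valid and searchable even when those details cannot be recovered.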

This is a project big in itself, but far bigger in its eventual impact. It has the potential to revolutionise how illustration is understood and the importance accorded to it, to supply an image-hungry commercial world with illustrative material, and to lead to ever more accurate ways of classifying and analysing images in large databases.

To deal with issues which arise during the project and to implement an impact strategy which will be felt well beyond it, we have set up a substantial panel of collaborators and external advisers who have expertise in the project itself and both expertise and influence in the world of potential users.

Planned Impact

The aim of 'Lost Visions', as the title suggests, is to make thousands of neglected images more widely visible. A database of this size and breadth will have a multitude of potential users. However, we shall work to target the following main constituencies:

Librarians: such a large data source will be of interest to libraries in general because of new information about publication and publishing history. Through reference librarians, we can reach members of the public such as those studying aspects of local history, family history and the origins of the modern industrialised world.

General public: the appeal of the database lies in the fact that, as well as making many hitherto neglected images and artists visible to the public eye, it includes some of the 'greats' of the period, including John Everett Millais and Frederic Leighton, and major foreign illustrators such as Retzsch, Gavarni and Gustave Doré.

Education: during workshops for school teachers and pupils held as part of the AHRC-funded DEDEFI project, we identified a need for more historical digital image sources, which this project will provide.

Picture researchers: a large number of images from this period are used on factual television and news and current affairs programmes. The availability of so many images from one source will be of enormous benefit to picture researchers.

Trainers of picture researchers e.g. University of Plymouth, Oxford Brookes University, the London College of Communication, the London School of Publishing, Picture Research Courses Online.

Employers of picture researchers: the numerous requests from media companies, publishers and advertising agencies for images contained in the Database of Mid-Victorian Illustration (DMVI) suggest that there will be significant demand for images from this much larger database.

Creative artists, including illustrators and graphic designers: this constituency has shown considerable interest in the wood-engraved images in DMVI. This interest will be intensified by a database that encompasses a variety of reproductive techniques and artistic styles. In terms of design and reproductive techniques, the period covered by the dataset is, arguably, the most significant in British book illustration. A period of rapid technological development saw the rise and fall of etching, wood engraving and photomechanical techniques. This collection of illustrations is, in itself, a history of nineteenth-century design that will be of interest to graphic designers and those working in related fields.

Book collectors and sellers: they will profit from the added metadata attached to the volumes.

Commercial: the project will provide a variety of analytic techniques which are in some degree portable across domains, and, in particular, will provide models of hardware architecture to those carrying out large-scale comparison of images. We are planning a Knowledge Transfer Project with Capture, an SME in digital data management.

Third Sector: the National Art Library Collections, the Department of Word and Image at the Victoria and Albert Museum, the Wellcome Library, the House of Illustration. We shall also target holders of specialist image collections, such as Collage (Guildhall Art Gallery and London Metropolitan Archives), the Boston Museum of Fine Arts (MA, USA), and the Ashmolean Museum, whose collections will be complemented by our database.

Publications

 
Title REimagine: a creative competition 
Description Entrants to this competition, organised by The Illustration Archive team, were encouraged to re-create a piece of art based on illustrations from the Archive. We had over 200 entries, primarily from schoolchildren.
Type Of Art Artwork 
Year Produced 2015 
Impact The activity led to more public engagement with, and awareness of, the Archive and more understanding of historic illustrations (we asked entrants to describe what they had learnt from taking part in the competition). 
URL https://www.youtube.com/watch?v=J81veSzrZr8
 
Description The aim of the project is to make searchable online over a million illustrations from books in the British Library's collection and to add to the existing metadata. We achieved this in the following ways:
1. By creating a searchable digital resource dedicated to book illustration
2. By creating a system for crowdsourced image tagging, including geotagging
3. By developing tools for image comparisons
4. By working on the interface and search functions to ensure usability across different constituencies
5. By including features on the site to appeal to different constituencies (e.g. classroom resources; exhibition creation)
Exploitation Route By making these illustrations searchable, we have effectively opened up this rich resource for multiple use. Illustrations from the archive have already been used in a number of research projects (e.g. by scholars working on Robin Hood, the Indian Mutiny, and Women in Trousers). Testament to the use of the Archive can be found in the impressive tagging statistics: to date we have over 185k image tags. A user survey on the site (2018) has also captured evidence of how the resource is being used in research, education, commercial contexts and for recreation.

The infrastructure of The Illustration Archive and DMVI (an earlier funded project) has been utilised in other image archives around the world, including Yellow Nineties Online, the Victorian Illustrated Shakespeare Archive and Illustrating Scott.
Sectors Creative Economy,Digital/Communication/Information Technologies (including Software),Education,Culture, Heritage, Museums and Collections

URL http://lostvisions.weebly.com/blog
 
Description During the course of the project, we held workshops with secondary school teachers and librarians in which we discussed ways of using the Archive. Based on the findings of these workshops, we developed learning and teaching materials, which are mounted on the website. The findings have been disseminated at conferences around the world. In 2015, we launched the REimagine project, which invited participants to take an illustration from The Illustration Archive and imaginatively recreate it. We received over 200 entries along with feedback from entrants detailing what they have learned about illustration from the Archive. In 2017, with funding from Cardiff University's Data Innovation Research Institute, we explored the digital mapping of the illustrations. This included the organisation of a Digital Mapping workshop, attended by delegates from across the museum and library sector. We have also used material from the Archive in Study Days at a local primary school (2017; 2018). The Illustration Archive is currently being developed as an impact case study for REF2020 based on its impact in increasing understanding of illustration, the public engagement enabled by the crowdsourcing infrastructure, its impact on teaching and learning, and its technological impact on other visual archives. In addition, a funding bid is in preparation for AHRC follow-on funding for impact and engagement (in collaboration with the University of Sussex).
First Year Of Impact 2015
Sector Digital/Communication/Information Technologies (including Software),Education
Impact Types Cultural,Societal

 
Description Data Innovation Research Institute seedcorn funding
Amount £8,000 (GBP)
Organisation Cardiff University 
Sector Academic/University
Country United Kingdom
Start 01/2017 
End 06/2017
 
Title Lost Visions Illustration Archive database 
Description A collection of bibliographic and iconographic metadata that reference a million illustrations from the BL's collection. 
Type Of Material Database/Collection of data 
Year Produced 2014 
Provided To Others? Yes  
Impact Intended to make the materials more discoverable 
 
Title Lost Visions Illustration Archive webserver 
Description This is a graphical user interface to the Illustration Archive Database, providing search functionality and discovery of "similar images". Also provides access to full page scans from linked images, and allows curation of customisable collections of images. The code is open source, and can be found at https://github.com/CSCSI/Lost-Visions 
Type Of Material Data handling & control 
Year Produced 2014 
Provided To Others? Yes  
Impact The webserver makes this material more readily searchable and accessible and has already led to several research applications. 
URL http://illustrationarchive.cf.ac.uk/
 
Title Lost Visions Machine Vision Toolkit 
Description This suite of tools is currently under development, running image comparison algorithms on the ARCCA HPC infrastructure, to detect similarities between images. This code is available open source, and will be formally released at the end of the project. 
Type Of Material Data analysis technique 
Provided To Others? No  
Impact This is still under development 
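
The record above does not say which image comparison algorithms the toolkit runs. Purely as an illustration of the general idea of detecting similar images, the sketch below uses an "average hash", a simple and widely known fingerprinting technique that is an assumption here and not the project's stated method. It assumes each image has already been reduced to a small grayscale grid; the function names, the threshold and the sample grids are hypothetical.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixel values (0-255), already downscaled


def average_hash(pixels: Grid) -> int:
    """Fingerprint an image: one bit per pixel, set where brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def likely_reuse(a: Grid, b: Grid, max_distance: int = 2) -> bool:
    """Flag two same-sized grids as possible re-use if their fingerprints are close."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Invented 4x4 sample grids, for illustration only.
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
# A uniformly lighter copy, standing in for a reprint from a paler scan.
reprint = [[value + 10 for value in row] for row in original]
# The tonal inverse, standing in for an unrelated image.
different = [[255 - value for value in row] for row in original]

reprint_distance = hamming_distance(average_hash(original), average_hash(reprint))
different_distance = hamming_distance(average_hash(original), average_hash(different))
```

Because the fingerprint depends only on whether each pixel is above or below the image's own mean, a uniform change in brightness leaves it unchanged. Re-drawings and recuttings alter the image content itself and would need more robust comparison than this sketch provides.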
 
Description Collaboration with ARCCA 
Organisation Cardiff University
Department Advanced Research Computing @ Cardiff (ARCCA)
Country United Kingdom 
Sector Academic/University 
PI Contribution In the Lost Visions project, we are adding a humanities big dataset to the largely scientific projects associated with ARCCA
Collaborator Contribution They have provided advice on the hosting of webservices, data storage techniques, data processing code development and selection of software packages. They also provide storage and infrastructure for the big dataset of images and the HPC resources for the processing of these.
Impact The storage and infrastructure provided by ARCCA make it possible to work on this large dataset of over a million illustrations.
Start Year 2014
 
Description Collaboration with British Library 
Organisation The British Library
Country United Kingdom 
Sector Public 
PI Contribution We are developing ways of making the dataset of illustrations searchable online
Collaborator Contribution The BL provided access to the big dataset
Impact The dataset provided by the BL provides the core data for the Lost Visions project
Start Year 2014
 
Description Collaboration with NSF 
Organisation National Science Foundation (NSF)
Country United States 
Sector Public 
PI Contribution Sharing ideas relating to Humanities Big Data research.
Collaborator Contribution They are consultants on the project and have provided information and expertise about their image matching algorithms and database setup choices, some of which we have drawn on in our own implementations.
Impact The results of this collaboration feed directly into the machine learning aspect of the project.
Start Year 2014
 
Title Lost Visions Illustration Archive Webserver 
Description This is a graphical user interface to the Illustration Archive Database, providing search functionality and discovery of "similar images". Also provides access to full page scans from linked images, and allows curation of customisable collections of images. The code is open source, and can be found at https://github.com/CSCSI/Lost-Visions 
Type Of Technology Webtool/Application 
Year Produced 2014 
Impact The webserver makes this material more readily searchable and accessible and has already led to several research applications. 
URL http://illustrationarchive.cf.ac.uk/
 
Title Lost Visions Machine Vision Toolkit 
Description This suite of tools is currently under development, running image comparison algorithms on the ARCCA HPC infrastructure, to detect similarities between images. This code is available open source, and will be formally released at the end of the project. 
Type Of Technology Software 
Year Produced 2014 
Open Source License? Yes  
Impact This is still under development and will be released at the end of the project. 
 
Description Demo of Lost Visions at KU Leuven 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Lots of discussion, sharing of ideas and plans for future collaboration

On the basis of this trip, we are planning a visit to Cardiff from members of the KU Leuven team working in Digital Humanities
Year(s) Of Engagement Activity 2014
 
Description Demonstration of Lost Visions at Agder 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Stimulating discussion followed the demonstration of the illustration archive

The possibility that the illustration archive could be used in different research contexts
Year(s) Of Engagement Activity 2014
 
Description Demonstration of Lost Visions at Digital Humanities Congress 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Demonstration of the database provoked discussion and potential collaboration with other projects in the field

The talk allowed contact to be made with researchers working in related areas
Year(s) Of Engagement Activity 2014
 
Description Demonstration of Lost Visions at Scuola Normale Superiore, Pisa 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact We discussed Lost Visions in the context of the Pisa-run Capti project and saw many points of intersection

The possibility of further collaboration
Year(s) Of Engagement Activity 2014
 
Description Libraries and The Illustration Archive workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The workshop sparked discussion amongst librarians about how the Illustration Archive can be used

Participants agreed to make visitors aware of the Archive
Year(s) Of Engagement Activity 2015
URL http://lostvisions.weebly.com/blog
 
Description Lost Visions Schools workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact The workshop discussed ways in which the Lost Visions database could be used in the classroom


The teachers were interested in using the illustration archive in their own teaching and suggested ways in which it might be adopted for the International Baccalaureate, in particular
Year(s) Of Engagement Activity 2014
 
Description Presentation of Archive at Digital History Seminar, London 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Presentation generated discussion afterwards

Future events planned
Year(s) Of Engagement Activity 2015
URL https://www.youtube.com/watch?v=LqDYsCsomnU
 
Description Presentation of Archive at Digital Material Conference, Galway 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Generated lots of discussion afterwards

Plans for future activity
Year(s) Of Engagement Activity 2015
 
Description Presentation of the Archive at Mulhouse and Strasbourg 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Talk generated lots of discussion

Plans for future events and collaborations were made
Year(s) Of Engagement Activity 2015
 
Description REimagine: an illustration workshop for children 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact Real interest amongst children in the activity worksheets

Children showed an interest in historic illustrations
Year(s) Of Engagement Activity 2015
 
Description Talk on Lost Visions for Romantic Illustration Network 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Talk with questions and discussions

This activity was part of the dissemination of the archive to the people who will use it for their research. There was lots of positive feedback.
Year(s) Of Engagement Activity 2014