Digital approaches to the capture and analysis of watermarks using the manuscripts of Isaac Newton as a test case

Lead Research Organisation: University of Cambridge
Department Name: History

Abstract

This project will investigate two research areas with general application in digital humanities scholarship, using the dispersed manuscript corpus of Isaac Newton as a test case. The immediate purpose of the test case will be to use artificial intelligence to assist with the identification and classification of watermarks in Newton material and, in the process, to build a general tool to assist with the organisation and dating of manuscripts. The project's significance extends well beyond Newton. Its first stage will investigate techniques for producing images of watermarks that are suitable for automated analysis, using both new photography and the latent potential of existing images. During the second stage, we will develop computer vision methods to systematically cluster and match the assembled corpus of watermark images across manuscripts and collections. Methods developed through this project will be transferable to watermark collections beyond Newton's corpus, creating a methodology for scholars seeking to analyse, date, and organise historical collections via watermark matching, and for conservators seeking to establish standardised surveying and documentation methods while imaging and digitising watermarked documents. A final stage of the project will allow us to disseminate our findings through research workshops, web tools, and improvements to online databases, as well as traditional publications in journals.
Since the groundbreaking early twentieth-century research of Charles Moïse Briquet, watermarks have played a central part in the dating of otherwise undated manuscripts. Briquet's monumental 1907 catalogue, Les filigranes, made it possible, in principle, to date (and to some extent localise) pre-1600 watermarks found by researchers in manuscripts by reference to exemplars in the catalogue. While this catalogue and others have been digitised thanks to the Bernstein consortium (https://memoryofpaper.eu/), advances in research and technology have revealed the limitations of the traditional approach, which requires time-consuming procedures and some degree of expertise to identify each single watermark. It is very difficult to find exact matches between watermarks in situ and those reproduced in any catalogue: first, because the catalogues are far from comprehensive and, second, because each individual watermark is produced in two "twin" versions, never perfectly identical, and suffers deformation over time as a result of repeated use in the paper manufacturing process. By developing and enhancing new approaches and techniques to improve the acquisition and analysis of watermarks, we hope to solve basic problems and thereby provide benefit to all who must rely upon paper documents for chronological evidence.
Computer vision has made significant progress in recent years thanks to machine learning and artificial intelligence. This project will build on cutting-edge work already undertaken by the École Nationale des Chartes and its partners (notably the computer scientists at École des Ponts ParisTech) to investigate the problem of matching images, specifically of watermarks, across formats (photographs and tracings). In creating a corpus of images used to train and develop the open source software created by the École des Chartes, we will build on recent work by The National Archives (TNA) to use comparatively affordable equipment and techniques to produce images of watermarks that are highly suitable for machine analysis. The project will develop and apply both of these approaches to enhance the computer-vision software so that it may be able to unlock the latent information held in thousands of existing images shot in reflected light, which institutions have already digitised and made accessible through IIIF.
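As an illustration of how existing IIIF-hosted images might be requested for analysis, the following minimal sketch builds an IIIF Image API request URL following the standard pattern {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The server address, the manuscript identifier, and the helper name iiif_image_url are hypothetical and are not taken from the project; whether a given server supports the "gray" quality depends on its compliance level.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="gray", fmt="jpg"):
    """Build an IIIF Image API URL (hypothetical helper, not project code).

    The identifier is percent-encoded so that any "/" it contains is not
    mistaken for a path separator, as the IIIF Image API requires.
    """
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Assumed example values: a made-up server and a made-up folio identifier.
# The region asks for the central half of the page, where a watermark
# might be expected, expressed as percentages of the full image.
url = iiif_image_url(
    "https://iiif.example.org/image/3",
    "example-manuscript/f12r",
    region="pct:25,25,50,50",
)
print(url)
```

Requesting a cropped greyscale region rather than the full colour image keeps downloads small, which matters when many thousands of existing page images are to be screened.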

Publications

Mandelbrote, S. (2022) The Newton Project in European Mathematical Society Magazine

Voelkel, J. R. (2021) Chasing the Clues in Isaac Newton's Manuscripts in Distillations: Science History Institute

 
Description The intention is to extend the award period by a year because Covid delayed the start of the research and imaging.
Nevertheless some key findings are already reportable:
1. It will be possible to produce imaging guidelines for the digitisation of watermarks, and to specify imaging behaviour for a variety of levels of photographic equipment. Following these guidelines will enable the production of useful images for computer analysis. Work is ongoing to produce such guidelines.
2. It is currently possible to train a machine-learning visual recognition tool to group watermarks by essential features to a level of approximately 70% accuracy. Enhanced training on a larger data set should allow the development of a tool with significantly enhanced accuracy. We have built an environment for such training and will be trying to improve the accuracy of recognition and to hone the factors used for recognition. This may require additional funding or collaborations within the extended term of the award.
3. Extensive imaging of paper used by Isaac Newton has been completed (at the University of Cambridge, the National Libraries, and the Huntington Library). These images will be made available in simple form in the Cambridge Digital Library gallery of images associated with the Newton Project, combined with transcriptions and associated metadata, or in the Chymistry of Isaac Newton pages hosted by Indiana University. A first batch should be online in late March. Most of the imaging, however, has been undertaken to provide data on which to train the visual recognition tool; use of these images is therefore ongoing, deploying supercomputing facilities at Indiana University. Collaboration with the Bodmer Library (Bodmer Lab) and the National Library of Israel has generated many new, lower-quality images of watermarks. Taken together with the images prepared for machine recognition, these images allow extensive manual scholarly work to group watermarks within Newton's archive. That work is ongoing and should provide an initial restructuring of the chronology of Newton's writings, pending the full development of the visual recognition tool.
4. Key collaborations have been established among repositories holding papers from Newton's archive on three continents and between scholars analysing the archive.
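To make the accuracy figure in finding 2 concrete, the following minimal sketch shows one way a grouping accuracy could be measured: each watermark image is represented by a feature vector, assigned to the group of its most similar labelled reference, and the proportion of correct assignments is reported. This is an illustrative assumption, not the project's tool: the feature vectors, group names, and function names are all invented, and a real system would derive its features from a trained model.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_group(query, references):
    """Return the label of the reference whose vector is most similar to query."""
    label, _ = max(references, key=lambda ref: cosine_similarity(query, ref[1]))
    return label


def grouping_accuracy(queries, references):
    """Fraction of labelled queries assigned to their true group."""
    correct = sum(1 for label, vec in queries
                  if nearest_group(vec, references) == label)
    return correct / len(queries)


# Invented toy data: one labelled reference vector per watermark group.
references = [
    ("pot", [1.0, 0.0, 0.2]),
    ("horn", [0.0, 1.0, 0.1]),
    ("arms", [0.1, 0.1, 1.0]),
]
# Labelled test images; the last one is deliberately closer to "horn".
queries = [
    ("pot", [0.9, 0.1, 0.1]),
    ("horn", [0.2, 0.8, 0.0]),
    ("arms", [0.0, 0.3, 0.9]),
    ("pot", [0.1, 0.9, 0.2]),
]

print(grouping_accuracy(queries, references))  # 0.75 on this toy data
```

On this reading, an accuracy of approximately 70% would mean that roughly seven in ten test images are placed in the correct group; a larger training set would be expected to improve the features rather than the comparison step.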
Exploitation Route We plan to publish imaging guidelines, prepared by the Cambridge University Library and the National Archives, which will be available for other institutions and individuals to use. (Timeline: to August 2024) This will allow the development of best practice and the planned acquisition of appropriate equipment by library imaging services elsewhere.
We are not yet able to make our visual recognition tool public but hope to be able to do so, either through the University of Cambridge or Indiana University, in due course. Once public, it will allow any user to upload new images in an appropriate form and have them categorised.
It should be possible to incorporate information about the dates of Newton's archive derived from the research of the project into the publicly available databases of the Newton Project and the Chymistry of Isaac Newton. This will have implications for all future scholarship on Newton's writings.
We will have held a closed meeting reporting progress and discussing future progress towards outcomes with interested parties in Cambridge in late March 2023; extension of the award to August 2024 will allow us to make a public presentation of more complete findings at the Huntington Library in January 2024.
Sectors Education, Culture, Heritage, Museums and Collections

URL https://live-events.nli.org.il/events/newton-watermark-project-yahuda-collection?doculang=false
 
Description Findings have been used to develop best practice in the imaging services of the Cambridge University Library, the National Archives, and the Huntington Library, San Marino, California. They have influenced the choices made by the Bodmer Lab (a collaboration between the University of Geneva and the Bodmer Library, Coligny, Switzerland) in the digitisation, encoding for TEI, and study of the Bodmer Newton manuscript, and have also been communicated to the National Library of Israel (which together with the Cambridge University Library holds the UNESCO World Heritage listing for the papers of Isaac Newton).
First Year Of Impact 2022
Sector Digital/Communication/Information Technologies (including Software), Culture, Heritage, Museums and Collections
Impact Types Cultural

 
Description Research and Collections Programme
Amount £3,800 (GBP)
Organisation University of Cambridge 
Department Cambridge University Library
Sector Academic/University
Country United Kingdom
Start 03/2022 
End 10/2022
 
Description Huntington Library 
Organisation Huntington Library
Country United States 
Sector Academic/University 
PI Contribution Provision of imaging protocols and advice; provision of database advice
Collaborator Contribution Provision of images; development of imaging protocols; database content
Impact None as yet
Start Year 2021
 
Description Indiana University 
Organisation Indiana University Bloomington
Country United States 
Sector Academic/University 
PI Contribution We have begun producing photographs that will be used as a dataset for computer analysis using supercomputing facilities at Indiana. We have begun working together on a database structure.
Collaborator Contribution Provision of supercomputing facilities. Provision of descriptions of manuscripts to be imaged.
Impact None as yet
Start Year 2021
 
Description King's College, Cambridge 
Organisation University of Cambridge
Department King's College Cambridge
Country United Kingdom 
Sector Academic/University 
PI Contribution Selection of manuscript items for imaging and for future inclusion in the data set for computer analysis
Collaborator Contribution Provision of manuscript items, preparation of items for future photography
Impact None so far
Start Year 2021
 
Description National Library of Israel 
Organisation National Library of Israel
Country Israel 
Sector Public 
PI Contribution We delivered four one-hour talks as part of an online education programme hosted by the National Library of Israel. Talks were attended by 30-50 participants on each occasion.
Collaborator Contribution National Library of Israel staff (head of western special collections; project manager, education and culture) participated throughout the talks and provided the infrastructure for them. A follow-up visit to Cambridge was made by the head of western special collections.
Impact The outputs are the four talks available via the URL. The disciplines involved are history/history of science, archive and collection management, and digital humanities.
Start Year 2022
 
Description The National Archives 
Organisation The National Archives
Country United Kingdom 
Sector Public 
PI Contribution Collaboration in production of imaging guidelines and selection of material to image; coding and identification of items for imaging.
Collaborator Contribution Production of draft imaging guidelines, technical assistance with establishing imaging equipment at the University Library; technical assistance with job search for post-doctoral appointment; selection and imaging of materials held by the National Archives.
Impact History, History of Science, Digital Humanities, Archive Sciences
Start Year 2021
 
Description Lectures on the Newton Watermark Project at the National Library of Israel 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Four public lectures (December 2022-March 2023) on aspects of the Watermarks project; follow-up visit to Cambridge by head of western special collections at the National Library of Israel.
Year(s) Of Engagement Activity 2022, 2023
URL https://live-events.nli.org.il/events/newton-watermark-project-yahuda-collection?doculang