Living Virtually: Creating and Interfacing Digital Surrogates of Textual Data Embedded (Hidden) in Cultural Heritage Artefacts

Lead Research Organisation: University of Oxford
Department Name: Classics Faculty

Abstract

Since the discovery of the carbonised papyri at Herculaneum in the 18th century, there has been a great deal of interest in accessing the content of the scrolls preserved by the intense heat from the eruption of Mount Vesuvius in 79 CE. The first attempts to open these scrolls were made by hand using a knife, but this caused them to break into fragments. Subsequently, in 1756, a machine was invented to provide a safer method of unrolling, which was applied more successfully to numerous scrolls. However, in many cases it was impossible to keep the different layers of papyrus from sticking to each other, so substantial portions of text remained hidden even in successfully opened scrolls, while hundreds of scrolls remained too firmly carbonised to unroll at all. The content of these fully intact scrolls, together with that of the text under the stuck-on layers, remains a mystery. New technology offers a solution. In the early 21st century the application of non-invasive CT scanning, a concept already proved by project members, has revealed new possibilities. The structure of a scroll can be rendered digitally in three dimensions, revealing the layers of the papyrus in the scroll's circumference. Computational methods for algorithmically separating, unrolling, and flattening these layers have been developed by project members over the past decade. The virtual unrolling method has been successfully applied to P. Herc. 375 and 495. Despite this achievement, the ink does not appear with any significant clarity. While faint traces of a handful of Greek letters have been transcribed, there is currently no means to verify and replicate such results.
This project aims to address the problem of detecting ink in this non-invasive imaging and thus definitively solve the long-standing problem posed by the Herculaneum papyri. In 2016 project members successfully applied the virtual unrolling method to a carbonised Hebrew scroll from the site of Ein Gedi in Israel. The ink was immediately visible, but only because it was contaminated with heavy trace elements and thus naturally appeared in CT scanning. The carbon-based ink used in Herculaneum papyri cannot be visualised in the same way. However, we now know that the ink is weakly contaminated with lead. We thus propose a new method called Dark Field X-ray Imaging, which reveals ink by isolating and capturing trace elements, such as lead, in its composition. To enhance the resulting ink signal further we introduce a new neural network called Reference-Amplified Computed Tomography (RACT) to amplify both the ink's presence and the shapes of the Greek characters for improved legibility. This method will definitively solve the problem of reading the text hidden in the Herculaneum papyri. To add value, the project will develop a new digital platform, the Augmented Language Interface for Cultural Engagement (ALICE), ensuring that the data produced by the Dark Field X-ray Imaging and RACT processes is accessible to researchers and to the curators responsible for these artefacts, can be properly curated, and that the extracted text can be digitally edited. Moreover, ALICE includes functionality for integrating 3D models of the original artefact and for recording, alongside the digital edition, the metadata that explains both how the text was created and where in the object's geometry, as represented in the generated model, the text originates. This is necessary for scientifically verifying and replicating any subsequent analysis or publication of the data.
Significantly for other cultural heritage artefacts that contain hidden text, our new imaging techniques and digital platform will be built using open architecture standards, so that the source code will be easily adaptable for non-invasive reading of writing inside other intractable artefacts, such as burnt books, book-bindings, and mummy cartonnage.

Planned Impact

This is a project to develop new imaging and machine learning techniques to extract and make legible textual data embedded or hidden in damaged cultural heritage artefacts, in order to reveal the hidden content. We will focus on the carbonised papyrus scrolls from Herculaneum, which will impact scholars in Classics, Papyrology, and Ancient Greek and Roman History and Philosophy by providing a significant body of new texts to advance their research. Other kinds of artefacts with hidden content will also be examined, and the data produced by the new techniques is applicable to other classes of material cultural artefacts. Poorly preserved and burnt books contain pages stuck together or spines that can no longer be opened without cracking and further deterioration. Book bindings contain hidden, cut-up pages of previously recycled books. The project will thus impact the curators of collections, as well as the visitors and users (academics, students, tourists and interested amateurs) of the museums and libraries that house them, by enabling them to read these hitherto illegible artefacts virtually.

The project will facilitate the application of the imaging and machine learning techniques it develops to other cultural heritage artefacts for the purpose of non-invasively extracting hidden textual data. The range of impact will be increased through the design and implementation of these new techniques according to open architecture standards. While the essential process of imaging and then virtually separating and revealing hidden text is the same, the physical object, language, and method of writing are heterogeneous. The fundamental algorithms for virtually unrolling and extracting textual data will thus work with an updatable reference library to accommodate variation in types of physical object, language, and writing characters. The automated system contains its own established protocol for learning and adapting to new reference material. To demonstrate this transfer of knowledge in real time, our project documents and lays the groundwork for how the system may be transferred from Herculaneum papyri to other objects by publishing the results of the project and collaborating with curators at the Bodleian Library, the Ashmolean Museum and the Chester Beatty Library to document how our project's methods are impacting their work and how they can be extended to other artefacts under their care.

As a model for interaction by visitors to and users of a museum or library, the system will enable them, through enhanced visualisation, to view and study artefacts that might otherwise not appear interesting or engaging. Herculaneum papyrus scrolls look like burnt logs of wood. A book with pages stuck together might just as well be a closed book on display. Artefacts exhibiting texts in ancient, dead, or rare languages are likely to be impenetrable without an expert at hand to explain and contextualise their significance. To make the visitor experience more dynamic and informative, the project offers virtual and augmented reality applications via mobile technology that tabulates use, and thereby impact, in conjunction with project interviews with museum curators and staff (tour guides, conservators and department heads). Image data (in both 2D and 3D), video, audio, and translations, for example, bring to life a static burnt papyrus scroll or a volume that cannot be opened. The data produced will be contextualised by a digital platform through which curation, annotation, and digital editing are performed to make the extracted text accessible and useful for academic research and viewers' understanding. The system will translate the image data and the subsequently annotated and edited images into the formats required by mobile and wearable technology, making the data ready for use in building augmented and virtual reality applications for virtual exhibition and visitor engagement.