Mapping Manuscript Migrations: Digging into data for the history and provenance of pre-modern European manuscripts

Lead Research Organisation: University of Oxford
Department Name: Engineering Science

Abstract

Hundreds of thousands of European pre-modern manuscripts have survived until the present day. As the result of changes in their ownership over the centuries, they are now spread all over the world. Collectively they constitute a great cultural and scholarly treasure. There are many sources of data relating to them, and new sources continue to proliferate in the digital environment. This project will link disparate datasets from Europe and North America to provide an international view of the history and provenance of these manuscripts. The aggregated data will enable researchers to analyse and visualize these topics at scales ranging from individual manuscripts to thousands of manuscripts. We will be able to show how these manuscripts have travelled across time and space to their current locations, where they continue to find new audiences.

The project will also be of particular relevance and value to libraries and other collecting institutions. The results of its analyses will situate their manuscript collections in the broader historical context of patterns and trends in collecting, while its methodology and its body of data will provide a very important resource for further aggregation and exploration in the future. The data linkage techniques and visualization methodologies deployed by the project will be of wider applicability to all kinds of cultural heritage objects and collections as well as manuscripts.

Planned Impact

The target audiences for the work of the Project are: manuscript researchers, collection and data custodians in libraries and museums, the digital humanities community, and the Linked Data community. The benefits to these groups will include: integrated and sophisticated access to data about manuscripts, with the potential to carry out their own research; new large-scale analyses and visualizations of manuscript histories; and a working environment for Linked Data in medieval and Renaissance studies, which can be built on in the future. Manuscript researchers and data custodians will be able to use the visualizations as the basis for outreach and dissemination of their work and information about their collections to a wider community audience.

The Project will engage with these target audiences through the connections of the Project Team members and through the following communication channels. A detailed Outreach Plan will be developed in the first stage of the Project. The Project will establish its own Web site to report progress, discuss issues, and link to datasets and data products. The site will include a blog, to which all members of the Project Team will be able to contribute. The Project will also establish a Twitter account, which will draw on the existing Twitter accounts of Project Team members and on followers of @DiggingIntoData for re-tweets and new followers. Videos explaining the work of the Project will be posted to the Schoenberg Institute's YouTube channel.

Training for researchers in the digital humanities and manuscript studies will take place in 2018/19 through workshops and short courses organized by the Schoenberg Institute, the IRHT, and the Digital Humanities Summer School at Oxford University. Workshops will also be offered for librarians and curators through professional bodies. Mentoring will be embedded in the Project through the employment of postdoctoral researchers. Members of the Project Team will present reports on progress at conferences relevant to medieval and Renaissance studies (Medieval Academy, Renaissance Society, ICMS at Kalamazoo and IMC at Leeds), digital humanities (international and national conferences), Linked Data and the Semantic Web (ESWC), and library and museum conferences. Members of the Project Team will submit articles for publication in refereed journals across the range of disciplines covered by the Project: manuscript studies, medieval and Renaissance studies, digital humanities, library and museum practice and collecting, and Linked Data and Semantic Web research.

Publications

 
Title Conversion of TEI-XML documents to RDF Linked Data 
Description Descriptions of medieval manuscripts in the Bodleian Library and other Oxford libraries are made available as XML documents encoded in accordance with the TEI (Text Encoding Initiative) Guidelines for manuscript descriptions. As part of the Mapping Manuscript Migrations project, we have developed a pipeline for converting these XML documents into RDF triples which can be ingested into a Linked Data triple store. This pipeline involves extracting relevant portions of the XML documents, converting these extracts into a single XML document, mapping this document to RDF triples using a Unified Data Model developed by the project, and uploading the RDF triples to the project's triple store. 
Type Of Material Improvements to research infrastructure 
Year Produced 2018 
Provided To Others? Yes  
Impact We are just beginning the process of publicizing this method to the community of manuscript librarians and researchers. 
URL https://github.com/bodleian/medieval-mss
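
Illustration: the extract-and-map steps of the pipeline described above might look roughly like the following minimal sketch. It is not the project's actual code. The TEI element names (msDesc, msIdentifier, settlement, repository, idno) come from the TEI Guidelines, but the sample record, the example.org namespaces, and the property names are assumptions made up for illustration, and the sample is a fragment rather than a complete valid TEI document. The sketch uses only the Python standard library and emits N-Triples text of the kind a triple store can ingest.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical namespaces: the project's real URIs are not given in the source.
BASE = "https://example.org/mmm/"
VOCAB = "https://example.org/mmm/schema/"

# Made-up TEI fragment for illustration only.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><sourceDesc>
    <msDesc xml:id="manuscript_1">
      <msIdentifier>
        <settlement>Oxford</settlement>
        <repository>Bodleian Library</repository>
        <idno>MS. Example 1</idno>
      </msIdentifier>
    </msDesc>
  </sourceDesc></fileDesc></teiHeader>
</TEI>"""


def literal(value):
    """Format a string as an N-Triples literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def tei_to_triples(xml_text):
    """Extract identifier fields from each msDesc and map them to triples."""
    root = ET.fromstring(xml_text)
    triples = []
    for ms in root.iterfind(".//tei:msDesc", TEI_NS):
        subject = f"<{BASE}manuscript/{ms.get(XML_ID)}>"
        for field in ("settlement", "repository", "idno"):
            node = ms.find(f"tei:msIdentifier/tei:{field}", TEI_NS)
            if node is not None and node.text:
                predicate = f"<{VOCAB}{field}>"
                triples.append((subject, predicate, literal(node.text.strip())))
    return triples


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples as N-Triples lines."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(tei_to_triples(SAMPLE_TEI)))
```

A real pipeline would map many more elements (contents, physical description, provenance) and would use the project's Unified Data Model rather than the flat placeholder properties used here.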
 
Title Mapping Manuscript Migrations triple store and Web service 
Description This database combines data from three existing databases (Schoenberg Database of Manuscripts, Bibale, and Medieval Manuscripts in Oxford Libraries) into a unified RDF Triple Store. Outputs from the three databases are transformed into RDF using a Unified Data Model based on two published ontologies which are widely deployed in the cultural heritage knowledge sector: CIDOC-CRM and FRBR. Entities referenced in each database (persons, places, organizations) are being reconciled and matched using Linked Open Data services like the Getty Thesaurus of Geographical Names and VIAF. A user interface for visualizing and exploring the combined data is currently being developed. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact This database is still in development. 
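
Illustration: the reconciliation step described above, in which entities from the three databases are matched through shared authority services, can be sketched as grouping records by a common authority identifier. This is a minimal sketch under stated assumptions, not the project's method. The records, local identifiers, and the "viaf:EXAMPLE-1" key are invented placeholders, and real reconciliation would also need fuzzy name matching and manual review for records that have no authority link.

```python
from collections import defaultdict

# Invented sample records: one person as described in three source databases,
# plus one record with no authority link. All identifiers are placeholders.
RECORDS = [
    {"source": "SDBM", "local_id": "name_1",
     "label": "Collector, Example", "authority": "viaf:EXAMPLE-1"},
    {"source": "Bibale", "local_id": "p42",
     "label": "Example Collector", "authority": "viaf:EXAMPLE-1"},
    {"source": "Oxford", "local_id": "person_7",
     "label": "Sir Example Collector", "authority": "viaf:EXAMPLE-1"},
    {"source": "Bibale", "local_id": "p99",
     "label": "Unidentified owner", "authority": None},
]


def local_key(rec):
    """Build a source-qualified key for a record."""
    return f'{rec["source"]}:{rec["local_id"]}'


def reconcile(records):
    """Group records by authority identifier; unlinked records stay on their own."""
    clusters = defaultdict(list)
    for rec in records:
        key = rec["authority"] or local_key(rec)
        clusters[key].append(rec)
    return dict(clusters)


def same_as_links(clusters):
    """Emit (local, 'owl:sameAs', authority) statements for reconciled records."""
    links = []
    for key, members in clusters.items():
        for rec in members:
            if local_key(rec) != key:
                links.append((local_key(rec), "owl:sameAs", key))
    return links


if __name__ == "__main__":
    for link in same_as_links(reconcile(RECORDS)):
        print(link)
```

Grouping on an external identifier keeps each source's own record intact while still letting queries treat the three descriptions as one entity.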
 
Title Unified Data Model for manuscript provenance data 
Description This Unified Data Model has been developed using elements from the CIDOC-CRM and FRBR ontologies. The model is used to aggregate data from three different manuscript databases and transform them into RDF Linked Data. 
Type Of Material Data handling & control 
Year Produced 2019 
Provided To Others? Yes  
Impact This model is still being refined. We have begun the process of publicizing the model at relevant conferences and workshops. 
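
Illustration: the record above says only that the model draws on CIDOC-CRM and FRBR; it does not say which classes and properties are used. As a hedged sketch of how a change of ownership could be expressed with CIDOC-CRM terms, the following represents one provenance event as an E8 Acquisition with its standard title-transfer properties (P22, P23, P24). The choice of these terms, the example.org base URI, and the Acquisition class name are assumptions for illustration, not the project's published model.

```python
from dataclasses import dataclass

CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "https://example.org/mmm/"  # hypothetical base URI


@dataclass(frozen=True)
class Acquisition:
    """One change of ownership of a manuscript (all identifiers are placeholders)."""
    event_id: str
    manuscript_id: str
    from_actor: str
    to_actor: str

    def triples(self):
        """Return the event as (subject, predicate, object) URI tuples."""
        event = f"{BASE}event/{self.event_id}"
        return [
            (event, RDF_TYPE, CRM + "E8_Acquisition"),
            (event, CRM + "P24_transferred_title_of",
             f"{BASE}manuscript/{self.manuscript_id}"),
            (event, CRM + "P23_transferred_title_from",
             f"{BASE}actor/{self.from_actor}"),
            (event, CRM + "P22_transferred_title_to",
             f"{BASE}actor/{self.to_actor}"),
        ]


if __name__ == "__main__":
    sale = Acquisition("e1", "manuscript_1", "actor_a", "actor_b")
    for s, p, o in sale.triples():
        print(f"<{s}> <{p}> <{o}> .")
```

Modelling each transfer as an event, rather than as a direct owner property on the manuscript, is what allows a chain of successive owners to be recorded and later traced across time and place.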
 
Description Mapping Manuscript Migrations 
Organisation Aalto University
Country Finland 
Sector Academic/University 
PI Contribution The research team at the Oxford e-Research Centre is contributing: specialist knowledge in computer science (Linked Data, ontologies, data modelling) and manuscript studies as well as project coordination and conceptualization.
Collaborator Contribution The other partners are contributing: specialist knowledge in computer science (Linked Data, ontologies, data modelling), database design and manuscript studies.
Impact Multi-disciplinary: historical studies, library and information science, computer science
Start Year 2017
 
Description Mapping Manuscript Migrations 
Organisation Institute for Research and History of Texts (IRHT)
PI Contribution The research team at the Oxford e-Research Centre is contributing: specialist knowledge in computer science (Linked Data, ontologies, data modelling) and manuscript studies as well as project coordination and conceptualization.
Collaborator Contribution The other partners are contributing: specialist knowledge in computer science (Linked Data, ontologies, data modelling), database design and manuscript studies.
Impact Multi-disciplinary: historical studies, library and information science, computer science
Start Year 2017
 
Description Mapping Manuscript Migrations 
Organisation University of Pennsylvania
Country United States 
Sector Academic/University 
PI Contribution The research team at the Oxford e-Research Centre is contributing: specialist knowledge in computer science (Linked Data, ontologies, data modelling) and manuscript studies as well as project coordination and conceptualization.
Collaborator Contribution The other partners are contributing: specialist knowledge in computer science (Linked Data, ontologies, data modelling), database design and manuscript studies.
Impact Multi-disciplinary: historical studies, library and information science, computer science
Start Year 2017
 
Description Focus Group for manuscript researchers 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Postgraduate students
Results and Impact A group of manuscript researchers, drawn primarily from the Oxford area, was convened to discuss their requirements for a digital discovery service which would improve their access to data about the history and provenance of medieval and Renaissance manuscripts. Many of the participants provided useful feedback for the project in relation to the requirements for designing such a service and their research questions and interests. Most of the participants subsequently followed the project on Twitter.
Year(s) Of Engagement Activity 2017