ARCHANGEL - Trusted Archives of Digital Public Records

Lead Research Organisation: University of Surrey
Department Name: Centre for Vision, Speech and Signal Processing (CVSSP)

Abstract

The aim of ARCHANGEL is to ensure the long-term sustainability of digital archives through the design, development and trialling of transformational new distributed ledger technology (DLT) to promote accessibility and ensure integrity of content, whilst maximising its impact through novel business models for commodification and open access.

Archives and Memory Institutions (AMIs) are the lens through which future generations will perceive today; they form the authoritative economic, social and cultural memory of a nation. For example, The National Archives (>15 petabytes) is one of the world's largest and oldest AMIs, responsible for preserving the digital record of the UK Government, e.g. key decisions made by Ministers and advice received. Some of this information is made open, some kept closed for decades. AMIs are founded upon the principle of public trust, of being neutral and completely trustworthy; the immutability and integrity of AMIs are essential to maintaining their objectivity. Yet world history is littered with examples where this objectivity has been compromised, e.g. through expunging of physical records during times of political unrest. Today's digital age presents new socio-technical challenges to AMIs around safeguarding of data. Digital public records are intangible and so easy to remove or modify without that modification necessarily being detectable. Indeed, in some cases records have to be modified to ensure their continued accessibility as formats change. The curation of data is also accompanied by the need to maintain associated code to render that data for presentation, often across decades. How should decisions over migration or prioritisation of maintenance be taken, or audited? What are the implications of migrations resulting in minor losses of fidelity one hundred years from now? How can the public be sure that digital content when released is fundamentally unaltered from the original? Existing archival practice is ill-equipped to respond to such issues, and is in urgent need of disruption to keep pace with our transformation into an increasingly digital society, so ensuring the integrity and impartiality of knowledge for future generations.

ARCHANGEL is an 18-month socio-technical feasibility study co-creating and evaluating a novel prototype DLT service with end-users to determine how archival practices, sustainable models and public attitudes could evolve in the presence of a trusted decentralised technology to prove content integrity and ensure open access to digital public archives. From a technological standpoint, ARCHANGEL will leverage cutting-edge machine learning to collect robust digital signatures derived from digitised physical and born-digital content, within a permissioned DLT. Both signatures and programmatic code to render content and verify its provenance and integrity will be encoded within the DLT. Novel business models for sustaining the DLT, e.g. via contributed effort (proof of work), will be explored at the points of creation and consumption using a cross-AMI model in which a single DLT is contributed to by multiple AMIs, across disciplines and nations, mitigating risk of archive distortion by its operating AMI. Impact is not limited to traditional AMIs, but extends to any digital public archive: university research data repositories (linked to DOI); better management of corporate memory in multi-nationals (e.g. financial/regulatory compliance, managing records of prior art in tech companies).
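The idea of recording content signatures in a ledger so that integrity can later be checked can be illustrated with a minimal sketch. It rests on several assumptions that are not from the proposal: a plain SHA-256 digest stands in for the machine-learning-derived signatures, an in-memory hash-chained list stands in for a permissioned DLT, and the names `ToyLedger`, `Entry` and `content_signature` are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List


def content_signature(data: bytes) -> str:
    """Stand-in signature: a SHA-256 hex digest (an assumption for illustration)."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class Entry:
    record_id: str
    signature: str
    prev_hash: str  # hash of the previous entry, which chains the entries together

    def entry_hash(self) -> str:
        payload = json.dumps(
            {
                "record_id": self.record_id,
                "signature": self.signature,
                "prev_hash": self.prev_hash,
            },
            sort_keys=True,
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class ToyLedger:
    """Hypothetical append-only ledger held in memory by a single party."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries: List[Entry] = []

    def register(self, record_id: str, data: bytes) -> Entry:
        prev = self.entries[-1].entry_hash() if self.entries else self.GENESIS
        entry = Entry(record_id, content_signature(data), prev)
        self.entries.append(entry)
        return entry

    def chain_is_intact(self) -> bool:
        # Recompute each link. Altering an entry breaks the link held by its
        # successor; the final entry has no successor, so a real ledger would
        # rely on replicated copies of the head to protect it.
        prev = self.GENESIS
        for entry in self.entries:
            if entry.prev_hash != prev:
                return False
            prev = entry.entry_hash()
        return True

    def verify(self, record_id: str, data: bytes) -> bool:
        sig = content_signature(data)
        return self.chain_is_intact() and any(
            e.record_id == record_id and e.signature == sig for e in self.entries
        )


ledger = ToyLedger()
ledger.register("rec-1", b"minutes of meeting, original scan")
ledger.register("rec-2", b"advice to minister")
print(ledger.verify("rec-1", b"minutes of meeting, original scan"))
print(ledger.verify("rec-1", b"minutes of meeting, edited"))
```

In this toy version a single party holds the list; the cross-AMI model described above would instead have several institutions hold and extend the same ledger, so that no single operator could rewrite it unnoticed.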

To undertake this adventurous and ambitious project we have formed a strategic multi-disciplinary partnership uniting a world-leading group in multi-modal signal processing (CVSSP), the Centre for the Digital Economy (CODE) within Surrey Business School, and a consortium of AMI stakeholders including The National Archives and Tim Berners-Lee's Open Data Institute (ODI). The infrastructure will be developed with DLT platform provider Guardtime, and impact accelerated via Methods Digital.

Planned Impact

ARCHANGEL will transform the sustainability of digital public archives, delivering long-term horizontal impact across all sectors within the Digital Economy benefiting from enhanced integrity and accessibility of such archives. Examples include archives of policy evidence (government and public services), research data (e.g. institutions working with medical or climate data), regulatory compliance data (finance, commerce) and intellectual property data (patent filing bodies); society benefits more broadly through Archives and other Memory Institutions (AMIs), from cultural archives to the National Archives of government. Through partnership with the Open Data Institute (ODI), ARCHANGEL will engage many organisations, both public sector and commercial, as well as cross-industry sector groups, in the building of the ARCHANGEL infrastructure to increase efficiency and generate more value from data, whether open, shared or closed. Decentralisation is an important factor in creating shared trustworthy data infrastructures, so this work will have applications in many areas across the economy.

ARCHANGEL will deliver vertical impact to specific sectors within the AMI landscape, driven through end-user partner The National Archives (TNA), one of the world's largest and oldest archives, with >185km of physical records and capability to store >15 petabytes of digital content. TNA sets standards and accredits other archives through the Archive Service Accreditation scheme. ARCHANGEL will impact the record of government through TNA, whose leadership role enables dissemination of best practice to others (interest has already been received from two additional cultural archives). A founder member of the Digital Preservation Coalition (DPC), TNA is ideally placed to ensure ARCHANGEL's outcomes reach other AMIs beyond the archives sector, such as the British Library and the National Records of Scotland, through the DPC. ARCHANGEL is directly aligned with TNA's priorities, which cite digital as its biggest strategic challenge. With the exponential growth of digital content (especially social media), ARCHANGEL is timely, coinciding with the deluge of government departments transferring digital records to TNA for long-term preservation and the mandated reduction in timing for transfer. UK society is impacted more broadly through TNA's adoption, since trusted records of government are needed to support policy development and assess its impact; to provide accountability for decisions; to share knowledge; and to enable departments to provide accurate and comprehensive evidence to inquiries or legal actions. Partner Methods Digital frequently contracts with key government verticals and big business to drive digital transformation and will assist with exploitation and dissemination.

ARCHANGEL engages significantly with end-users throughout, adopting an open participatory, co-design process. Engagement begins with sandpits scoping the scenarios and real-world problems to address, following up with iterative live trials. We will organise a cross-disciplinary stakeholder workshop, and engage relevant RCUK networks (e.g. CREDIT), initiating a series of mini-projects to develop case studies showcasing ARCHANGEL in the wild to maximise impact and uptake of the research.

We are committed to public engagement, which will see outcomes packaged into film-based deliverables as part of a schools and broader public outreach programme engaging the public in future-scoping workshops, further informing research objectives. Academic impact will be delivered through targeting top-tier internationally-focused venues in the multi-modal signal processing and secure systems space, e.g. PAMI, ICCV, IJCAI, ACM TISSEC, TDSC. Of potentially high impact is the adaptation of Guardtime's global DLT platform to deliver an entirely novel DLT-based infrastructure for trusted public digital archives. This could open new information assurance markets for Guardtime and DLT providers.

 
Description ARCHANGEL has developed a software solution based upon Blockchain technology that will help archivists and the general public improve their trust in digital archives. The software has been initially developed in the context of University research data archives but could generalise to any archive storing digital records. The software has been trialled with end-users in research data management drawn from the commercial and higher education sectors. This has led to new insights into the value proposition for ARCHANGEL technology in these use contexts and will inform the project as it moves forward beyond its initial six months to study other use cases based around digital content in the UK National Archives.
Exploitation Route The ARCHANGEL technical platform could be used by other forms of public digital archive to secure the integrity of their digital records, or exploited by commercial operators in this space to enhance their technology offerings around ensuring provenance of archive content. The insights into ARCHANGEL from a socio-technical perspective, e.g. those collected at focus groups and the workshop focusing on use case 1 of the project (research data archives), could inform policy regarding research data management at HEIs or wider cross-sector initiatives such as JISC RRDS.
Sectors Digital/Communication/Information Technologies (including Software); Education; Culture, Heritage, Museums and Collections; Security and Diplomacy

 
Description Six months into the project, ARCHANGEL has developed a technical prototype of a Distributed Ledger Technology (DLT or 'Blockchain') based solution for verifying the integrity of digital public archives. The prototype has focused on the first of the three six-month use cases specified in the proposal: Research Data Archives (e.g. University research data archives). The prototype has been developed on Ethereum (after exploring the technical feasibility of several DLT platforms) and trialled with ~15 end-users from research data archives in a workshop held at the University of Surrey. Outcomes of this workshop and a description of the socio-technical considerations during its development have been disseminated at several high-profile events in the digital archives world, and broader presentations on the project's initial outcomes have been made at events attended by government and the public sector. A further outcome of the project has been the construction of a private Ethereum DLT testbed at the University of Surrey, co-supporting other DLT-based research projects at the institution. Challenges overcome to deliver this impact include co-development of the ARCHANGEL platform with real-world archivists and overcoming technical hurdles due to the rapidly evolving space of DLT and AI, both of which underpin this project. These hurdles include researching appropriate content hashing algorithms and assessing the feasibility of various DLT architectures for ARCHANGEL.
First Year Of Impact 2017
Sector Digital/Communication/Information Technologies (including Software); Education; Government, Democracy and Justice; Culture, Heritage, Museums and Collections; Security and Diplomacy
Impact Types Societal; Policy & public services
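The impact description above mentions researching appropriate content hashing algorithms without naming them. As a purely illustrative sketch of one trade-off such research might weigh, the snippet below contrasts an exact cryptographic hash with a very simple "average hash" on a tiny greyscale image. The choice of average hashing, the 2x2 pixel grids and every function name here are assumptions made for illustration, not the project's method.

```python
import hashlib
from typing import List

Image = List[List[int]]  # rows of greyscale pixel values in the range 0-255


def exact_hash(img: Image) -> str:
    """SHA-256 over the raw pixel bytes: any change at all alters the digest."""
    raw = bytes(p for row in img for p in row)
    return hashlib.sha256(raw).hexdigest()


def average_hash(img: Image) -> List[int]:
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming(a: List[int], b: List[int]) -> int:
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


original = [[10, 200], [220, 30]]
migrated = [[12, 198], [221, 29]]  # hypothetical minor fidelity loss after format migration
tampered = [[200, 10], [30, 220]]  # hypothetical change of actual content

print(exact_hash(original) == exact_hash(migrated))
print(hamming(average_hash(original), average_hash(migrated)))
print(hamming(average_hash(original), average_hash(tampered)))
```

The exact hash treats the migrated copy as a different object, while the coarse hash keeps it close to the original and still separates the tampered copy. A hash this simple is easy to fool, so it only illustrates the tension between strictness and tolerance, not a usable design.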

 
Title Ethereum based ARCHANGEL platform technical prototype 
Description A prototype of the ARCHANGEL technical platform was developed across the initial six months of the project, enabling the integrity of digital content within archives to be verified. 
Type Of Technology New/Improved Technique/Technology 
Year Produced 2018 
Impact The technology prototype enabled a workshop to run in March 2018 with attendees from the commercial and higher education sectors to try out the technology and to identify socio-technical research questions that the project should address around the platform going forward. 
 
Description Blog post by Open Data Institute on the ARCHANGEL concept and project progress 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Jamie Fawcett and the Open Data Institute authored a blog post on their ODI Research site covering the ARCHANGEL project technical architecture and concept for digital preservation. Initial project findings were reported at a high level for the general public audience.
Year(s) Of Engagement Activity 2018
 
Description Presentation about Digital Archives of the Future 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact John Sheridan, the Digital Director at The National Archives, talked about the ARCHANGEL project as part of a public lecture hosted by The National Archives about digital archives of the future, looking ahead 40 years. The event was part of The National Archives' public engagement activities. Work done in the ARCHANGEL project was of particular interest in the question and answer session.
Year(s) Of Engagement Activity 2018
URL https://www.eventbrite.co.uk/e/digital-archives-of-the-future-tickets-40906662930
 
Description Presentation to DLM Forum Triennial Conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact John Sheridan, the Digital Director at The National Archives, talked about the ARCHANGEL project as part of a presentation about the future of digital archiving. Location: University of Brighton. Audience: Digital archivists, primarily from European National Archives.
Year(s) Of Engagement Activity 2017
URL http://www.dlmforum.eu/index.php/home/all-events/76-dlm-forum-brighton-triennial-arma-international-...
 
Description Presentation to PASIG (Preservation and Archiving Special Interest Group) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact John Sheridan, the Digital Director at The National Archives, talked about the ARCHANGEL project as part of a presentation about the future of digital archiving to an audience of digital preservation practitioners and software vendors from around the world.
Year(s) Of Engagement Activity 2017
URL https://pasigoxford.org/
 
Description Presentation to archival studies students from the University of Liverpool about Digital Archiving 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Schools
Results and Impact John Sheridan, the Digital Director at The National Archives, talked about the ARCHANGEL project as part of a presentation to archival studies students from the University of Liverpool about digital archiving. Location: The National Archives, Kew.
Year(s) Of Engagement Activity 2018
 
Description Presentation to the Information Studies Symposium (ISS) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact John Sheridan, the Digital Director at The National Archives, talked about the ARCHANGEL project as part of a presentation about digital archiving to an audience of academics from around the world involved in the Information Studies subject area.
Year(s) Of Engagement Activity 2017
 
Description University of Salford Blockchain Conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Jamie Fawcett gave a 25-minute talk on blockchain applications and presented the ARCHANGEL project as a case study. He outlined the aims of the project and explained how the approach tackles challenges faced by archivists, as well as the technical and practical challenges already identified in the project. He briefly discussed the development of the first prototype.
Year(s) Of Engagement Activity 2018