ARCHANGEL - Trusted Archives of Digital Public Records

Lead Research Organisation: University of Surrey
Department Name: Vision Speech and Signal Proc CVSSP

Abstract

The aim of ARCHANGEL is to ensure the long-term sustainability of digital archives through the design, development and trialling of transformational new distributed ledger technology (DLT) to promote accessibility and ensure integrity of content, whilst maximising its impact through novel business models for commodification and open access.

Archives and Memory Institutions (AMIs) are the lens through which future generations will perceive today; they form the authoritative economic, social and cultural memory of a nation. For example, The National Archives (>15 petabytes) is one of the world's largest and oldest AMIs responsible for preserving the digital record of the UK Government e.g. key decisions made by Ministers and advice received. Some of this information is made open, some kept closed for decades. AMIs are founded upon the principle of public trust, of being neutral and completely trustworthy; the immutability and integrity of AMIs are essential to maintaining their objectivity. Yet world history is littered with examples where this objectivity has been compromised e.g. through expunging of physical records during times of political unrest. Today's digital age presents new socio-technical challenges to AMIs around safeguarding of data. Digital public records are intangible and so easy to remove or modify without that modification necessarily being detectable. Indeed in some cases records have to be modified to ensure their continued accessibility as formats change and the curation of data is also accompanied by the need to maintain associated code to render that data for presentation, often across decades. How should decisions over migration or prioritisation of maintenance be taken, or audited? What are the implications of migrations resulting in minor losses of fidelity one hundred years from now? How can the public be sure that digital content when released is fundamentally unaltered from the original? Existing archival practice is ill-equipped to respond to such issues, and is in urgent need of disruption to keep pace with our transformation into an increasingly digital society, so ensuring the integrity and impartiality of knowledge for future generations.

ARCHANGEL is an 18-month socio-technical feasibility study co-creating and evaluating a novel prototype DLT service with end-users to determine how archival practices, sustainable models and public attitudes could evolve in the presence of a trusted decentralised technology to prove content integrity and ensure open access to digital public archives. From a technological standpoint, ARCHANGEL will leverage cutting-edge machine learning to collect robust digital signatures derived from digitised physical and born-digital content, within a permissioned DLT. Both signatures and programmatic code to render content and verify its provenance and integrity will be encoded within the DLT. Novel business models for sustaining the DLT e.g. via contributed effort (proof of work) will be explored at the points of creation and consumption using a cross-AMI model in which a single DLT is contributed to by multiple AMIs, across disciplines and nations, mitigating risk of archive distortion by its operating AMI. Impact is not limited to traditional AMIs, but extends to any digital public archive: University research data repositories (linked to DOI); better management of corporate memory in multi-nationals (e.g. financial/regulatory compliance, managing records of prior art in tech companies).
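The idea of recording content signatures in a ledger so that later tampering becomes detectable can be illustrated with a minimal sketch. It is not the ARCHANGEL implementation: it assumes a plain SHA-256 digest in place of the machine-learning-derived signatures described above, and a single in-memory hash chain in place of a permissioned DLT shared between AMIs. All names (content_digest, Entry, ToyLedger) are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a record's bytes (stand-in signature)."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class Entry:
    """One ledger entry: a record identifier, its digest, and a link to the previous entry."""
    record_id: str
    digest: str
    prev_hash: str

    def entry_hash(self) -> str:
        # Hash a canonical JSON encoding so the result is deterministic.
        payload = json.dumps(
            {
                "record_id": self.record_id,
                "digest": self.digest,
                "prev_hash": self.prev_hash,
            },
            sort_keys=True,
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class ToyLedger:
    """Append-only hash chain held in memory (assumption: no consensus, no network)."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries = []

    def register(self, record_id: str, data: bytes) -> Entry:
        # Link the new entry to the hash of the most recent one.
        prev = self.entries[-1].entry_hash() if self.entries else self.GENESIS
        entry = Entry(record_id, content_digest(data), prev)
        self.entries.append(entry)
        return entry

    def chain_is_intact(self) -> bool:
        # Recompute every link; any rewritten entry breaks the links after it.
        prev = self.GENESIS
        for entry in self.entries:
            if entry.prev_hash != prev:
                return False
            prev = entry.entry_hash()
        return True

    def verify(self, record_id: str, data: bytes) -> bool:
        # Content is trusted only if the chain is intact and the digest matches.
        if not self.chain_is_intact():
            return False
        digest = content_digest(data)
        return any(
            e.record_id == record_id and e.digest == digest for e in self.entries
        )
```

In this sketch an archive would call register when a record is ingested and verify when it is released. An exact cryptographic digest fails as soon as a file is migrated to a new format, which is one reason more robust, content-aware signatures are of interest for long-lived archives.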

To undertake this adventurous and ambitious project we have formed a strategic multi-disciplinary partnership uniting a world-leading group in multi-modal signal processing (CVSSP), the Centre for the Digital Economy (CODE) within Surrey Business School, and a consortium of AMI stakeholders including The National Archives and Tim Berners-Lee's Open Data Institute (ODI). The infrastructure will be developed with DLT platform provider Guardtime, and impact accelerated via Methods Digital.

Planned Impact

ARCHANGEL will transform the sustainability of digital public archives, delivering long-term horizontal impact across all sectors within the Digital Economy benefiting from enhanced integrity and accessibility of such archives. For example, archives of policy evidence (government and public services), research data (e.g. institutions working with medical or climate data), regulatory compliance data (finance, commerce), intellectual property data (patent filing bodies) and benefiting society more broadly through Archives and other Memory Institutions (AMIs) from cultural archives to the National Archives of government. Through partnership with the Open Data Institute (ODI), ARCHANGEL will engage many organisations, both public sector and commercial, as well as cross industry sector groups, in the building of the ARCHANGEL infrastructure to increase efficiency and generate more value from data, whether open, shared or closed. Decentralisation is an important factor in creating shared trustworthy data infrastructures, so this work will have applications in many areas across the economy.

ARCHANGEL will deliver vertical impact to specific sectors within the AMI landscape, driven through end-user partner The National Archives (TNA) - one of the world's largest and oldest archives, with >185km of physical records and capability to store >15 petabytes of digital content. TNA sets standards and accredits other archives through the Archive Service Accreditation scheme. ARCHANGEL will impact the record of government through TNA, whose leadership role enables dissemination of best practice to others (interest already received from two additional cultural archives). A founder member of the Digital Preservation Coalition (DPC), TNA is ideally placed to ensure, through the DPC, that ARCHANGEL's outcomes reach other AMIs beyond the archives sector such as the British Library and the National Records of Scotland. ARCHANGEL is directly aligned with TNA's priorities, which cite digital as its biggest strategic challenge. With the exponential growth of digital content (especially social media), ARCHANGEL is timely, coinciding with the deluge of government departments transferring digital records to TNA for long-term preservation and the mandated reduction in timing for transfer. UK society is impacted more broadly through TNA's adoption since trusted records of government are needed to support policy development and assess its impact; to provide accountability for decisions; to share knowledge; and to enable departments to provide accurate and comprehensive evidence to inquiries or legal actions. Partner Methods Digital frequently contracts with key government verticals and big business to drive digital transformation and will assist with exploitation and dissemination.

ARCHANGEL engages significantly with end-users throughout, adopting an open participatory, co-design process. Engagement begins with sandpits scoping the scenarios and real-world problems to address, following up with iterative live trials. We will organise a cross-disciplinary stakeholder workshop, and engage relevant RCUK networks (e.g. CREDIT), initiating a series of mini-projects to develop case studies showcasing ARCHANGEL in the wild to maximise impact and uptake of the research.

We are committed to public engagement which will see outcomes packaged into film-based deliverables as part of a schools and broader public outreach programme engaging the public in future-scoping workshops, further informing research objectives. Academic impact will be delivered through targeting top-tier internationally-focused venues in the multi-modal signal processing and secure systems space e.g. PAMI, ICCV, IJCAI, ACM TISSEC, TDSC. Of potentially high impact is the adaptation of Guardtime's global DLT platform to deliver an entirely novel DLT based infrastructure for trusted public digital archives. This could open new information assurance markets for Guardtime and DLT providers.

Publications

 
Description ARCHANGEL has developed a software solution based upon Blockchain technology that will help archivists and the general public improve their trust in digital archives. The software was initially developed in the context of University research data archives and was later extended to safeguard digital video records, but could generalise to any archive storing any kind of digital records. The software has been trialled with end-users in research data management drawn from the commercial and higher education sectors and with national archives in a pilot deployment across multiple international government archives. This has led to new insights into the value proposition for ARCHANGEL technology in these use contexts and will inform the project as it moves forward to tackle its final use case around other kinds of digital content in the UK National Archives.
Exploitation Route The ARCHANGEL technical platform could be used by other forms of public digital archive to secure the integrity of their digital records, or exploited by commercial operators in this space to enhance their technology offerings around ensuring provenance of archive content. The insights into ARCHANGEL from a socio-technical perspective, e.g. collected at focus groups and the workshop focusing on use case 1 of the project (research data archives), could inform policy regarding research data management at HEIs or wider cross-sector initiatives such as JISC RRDS.
Sectors Digital/Communication/Information Technologies (including Software); Education; Culture, Heritage, Museums and Collections; Security and Diplomacy

 
Description One year into the project ARCHANGEL has developed two technical prototypes of a Distributed Ledger (DLT or 'Blockchain') based solution for verifying the integrity of digital public archives. The first prototype focused on the first of the three 6-month use cases specified in the proposal - Research Data Archives (e.g. University research data archives). The prototype has been developed on the Ethereum platform (after exploring technical feasibility of several DLT platforms) and trialled with ~15 end-users from research data archives in a workshop held at the University of Surrey. Outcomes of this workshop and a description of the socio-technical considerations during its development have been disseminated at several high profile events in the digital archives world, and broader presentations on the project's initial outcomes made at events attended by government and the public sector. This resulted in high profile publications at ACM Doc Eng (technical) and at the iPRES conference (archival practice) in 2018. These activities and papers produced substantial national media interest including mention of the project as an example of successful public sector-academic collaboration on DLT in Margot James MP's (DCMS) speech at Blockchain Live 2018. The prototype was subsequently extended to cover the second use case on the project - securing digital video within government national archives. A new AI technique was built to detect tampering in digital video and could for example be used to guard against deepfakes (AI manipulated fake videos on the internet). At the time of submission the project is awaiting the outcome of publication review of a paper on this topic at CVPR, the premier computer vision conference. We live trialled the work at the National Archives in the UK and are about to go live on a small-scale international trial of the ARCHANGEL platform across the national archives of the UK, Norway and Estonia.
Challenges overcome to deliver this impact include co-development of the ARCHANGEL platform with real-world archivists and overcoming technical hurdles due to the rapidly evolving space of DLT and AI, both of which underpin this project. These include researching appropriate content hashing algorithms and assessing the feasibility of various DLT architectures for ARCHANGEL. A further outcome of the project has been the construction of a private Ethereum DLT testbed at the University of Surrey, co-supporting other DLT based research projects at the institution.
First Year Of Impact 2017
Sector Digital/Communication/Information Technologies (including Software); Education; Government, Democracy and Justice; Culture, Heritage, Museums and Collections; Security and Diplomacy
Impact Types Societal,Policy & public services

 
Description Lord Holmes Roundtable on Distributed Ledger Technology for the Public Good
Geographic Reach National 
Policy Influence Type Gave evidence to a government review
Impact Led to co-founding of several working groups with attendees of the roundtable e.g. with HMRC and DWP around use of Distributed Ledger Technology in government.
 
Description Blockstart: Blockchain-based applications for SME competitiveness (BSTART)
Amount € 4,000,000 (EUR)
Funding ID NWE 870 
Organisation INTERREG IIIC North 
Sector Public
Country France
Start 04/2019 
End 01/2022
 
Title Ethereum based ARCHANGEL platform technical prototype 
Description A prototype of the ARCHANGEL technical platform was developed across the initial six months of the project, enabling the integrity of digital content within archives to be verified. 
Type Of Technology New/Improved Technique/Technology 
Year Produced 2018 
Impact The technology prototype enabled a workshop to run in March 2018 with attendees from the commercial and higher education sectors to try out the technology and to identify socio-technical research questions that the project should address around the platform going forward. 
 
Description ACCU workshop on The Very Slow Time Machine 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Jez Higgins (ODI) delivered a workshop titled 'The Very Slow Time Machine' on best software practice around engineering blockchain based solutions for digital archiving.
Year(s) Of Engagement Activity 2019
URL https://skillsmatter.com/meetups/12030-accu-london-the-very-slow-time-machine
 
Description Blog post by Open Data Institute on the ARCHANGEL concept and project progress 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Jamie Fawcett and the Open Data Institute authored a blog post on their ODI Research site covering the ARCHANGEL project technical architecture and concept for digital preservation. Initial project findings were reported at a high level for the general public audience.
Year(s) Of Engagement Activity 2018
 
Description Keynote at Archives and AI - FAN 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Keynote presentation on ARCHANGEL outcomes at FAN 2018 - workshop for leaders of national archives from various nation states
Year(s) Of Engagement Activity 2018
 
Description NATO HQ - Meeting of the NATO Archives Committee on Machine Learning 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Mark Bell of The National Archives gave a talk on 11 December 2018 at the Meeting of the NATO Archives Committee on Machine Learning, disseminating best practice around AI and DLT used for secure archiving in ARCHANGEL
Year(s) Of Engagement Activity 2018
 
Description ODI Blog post: Archives and Blockchain 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Public outreach blog post by the Open Data Institute on early outcomes of the ARCHANGEL project and more broadly Blockchain's potential role in the future of archiving
Year(s) Of Engagement Activity 2018
URL https://theodi.org/article/blockchains-potential-role-in-the-future-of-archiving/
 
Description ODI Blog post: Challenges in using blockchains to build trust in digital archiving 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact ODI Blog post on Challenges in using blockchains to build trust in digital archiving, targeted at non-expert public / general science dissemination
Year(s) Of Engagement Activity 2019
URL https://theodi.org/article/challenges-in-using-blockchain-to-build-trust-in-digital-archiving/
 
Description Participation on DLT/Identity for Gig Economy Workshop at DWP 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Invited participant at DWP hosted government-industry-academic workshop on DLT for self-sovereign identity in the context of the Gig Economy.
Year(s) Of Engagement Activity 2019
 
Description Peter Wall Institute for Advanced Studies International Research Roundtable 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact John Collomosse was invited to deliver a keynote and participate in an expert panel at the University of British Columbia in Vancouver, to talk about ARCHANGEL technologies and impact on archival practice
Year(s) Of Engagement Activity 2019
 
Description Presentation about Digital Archives of the Future 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact John Sheridan, the Digital Director at The National Archives, talked about the Archangel project, as part of a public lecture hosted by The National Archives about digital archives of the future, looking ahead 40 years. The event was part of The National Archives' public engagement activities. Work done in the Archangel project was of particular interest in the question and answer session.
Year(s) Of Engagement Activity 2018
URL https://www.eventbrite.co.uk/e/digital-archives-of-the-future-tickets-40906662930
 
Description Presentation to DLM Forum Triennial Conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact John Sheridan, the Digital Director at The National Archives, talked about the Archangel project, as part of a presentation about the future of digital archiving to an audience of digital archivists from European archives. Location: University of Brighton. Audience: Digital archivists, primarily from European National Archives.
Year(s) Of Engagement Activity 2017
URL http://www.dlmforum.eu/index.php/home/all-events/76-dlm-forum-brighton-triennial-arma-international-...
 
Description Presentation to PASIG (Preservation and Archiving Special Interest Group) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact John Sheridan, the Digital Director at The National Archives, talked about the Archangel project, as part of a presentation about the future of digital archiving to an audience of digital preservation practitioners and software vendors from around the world
Year(s) Of Engagement Activity 2017
URL https://pasigoxford.org/
 
Description Presentation to archival studies students from the University of Liverpool about Digital Archiving 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Schools
Results and Impact John Sheridan, the Digital Director at The National Archives, talked about the Archangel project, as part of a presentation to archival studies students from the University of Liverpool about digital archiving. Location: The National Archives, Kew
Year(s) Of Engagement Activity 2018
 
Description Presentation to the Information Studies Symposium (ISS) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact John Sheridan, the Digital Director at The National Archives, talked about the Archangel project, as part of a presentation about digital archiving to an audience of academics from around the world involved in the Information Studies subject area.
Year(s) Of Engagement Activity 2017
 
Description Presented ARCHANGEL at EPSRC / Digital Economy 10 year anniversary event in London 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact ARCHANGEL was selected as a highlight of 10 years of Digital Economy funded projects. John Collomosse delivered a flash presentation and demo of ARCHANGEL to disseminate outcomes and potential of AI/DLT for secure digital recordkeeping.
Year(s) Of Engagement Activity 2019
 
Description Presented mid-term review of ARCHANGEL outcomes at National Cyber Security Centre 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Presented outcomes of ARCHANGEL at mid-term review organised by John Baird/EPSRC at National Cyber Security Centre in London. Government (HMRC/DCMS/Security Services) were in attendance, as were other academics/recipients of TIPS and ADLT funding.
Year(s) Of Engagement Activity 2018
 
Description Presented talk on AI and DLT in ARCHANGEL to UK Information Management Liaison Group (IMLG) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact John Collomosse gave a keynote at the IMLG 2018 conference hosted at the National Archives, informing archives and other UK public sector / government departments of the potential for AI and DLT, via ARCHANGEL outcomes, for tamper-proof data.
Year(s) Of Engagement Activity 2018
 
Description Software engineering workshop and technical meeting on Blockchain for Archives 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Technical workshop organised by Jez Higgins (ODI) to disseminate best software engineering practice around Blockchain for archives
Year(s) Of Engagement Activity 2018
URL https://www.meetup.com/meetup-group-MzfqIqCy/events/fppxlqyzdbzb/
 
Description University of Salford Blockchain Conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Jamie Fawcett gave a 25-minute talk on blockchain applications and spoke about the ARCHANGEL project as a case study. He talked through the aims of the project and explained how the approach was tackling challenges faced by archivists, and also the technical/practical challenges already identified in the project. He briefly discussed the development of the first prototype.
Year(s) Of Engagement Activity 2018