Towards a National AI-Enabled Repository for Wales

Lead Research Organisation: Aberystwyth University
Department Name: Computer Science

Abstract

The project will develop a prototype distributed National trusted data repository (TDR) for Wales (NTDRW), which will have artificial intelligence (AI)-enabled functionality for knowledge discovery. The project will be developed in co-creation with stakeholders, assessing readiness both with appropriate toolsets and through a series of focus groups. These two strands will concentrate on the following four TDR areas: data capture, data repositories, AI-enabled search, and knowledge discovery. The project will provide a detailed specification for the NTDRW, including technical requirements, post-construction operational costs, and training requirements.

One of the aims of the project is to bring the AHRC born-digital data community together. This is a collaborative project involving the following academic partners: Aberystwyth University, Bangor University, Cardiff Metropolitan University, Cardiff University, Open University in Wales, Swansea University, and University of South Wales; and the following non-academic partners: Archives and Records Council Wales, Cadw, Canolfan Bedwyr, the Digital Preservation Coalition, Eisteddfod Genedlaethol Cymru, the National Library of Wales, the Royal Commission on the Ancient and Historical Monuments of Wales, and Welsh Government.
 
Description 5. Top 5 recommendations
5.1. Establishment of a bilingual AI-enhanced interoperable discovery layer for a trusted repository for Wales based on a distributed custody and storage model, enabling cross-search, knowledge discovery, metadata enhancement and linked data capabilities for digital arts and humanities data.
5.2. Development of an API for both display and re-ingest of AI-enhanced metadata. The DSpace APIs already provide a range of functions for querying metadata, but we will investigate whether these are sufficient or if additional APIs need to be implemented.
5.3. Development of a trusted storage environment for estray data, which is also available as additional storage capacity for member repositories.
5.4. Development of a training framework for the arts and humanities based on the FAIR Principles, to include: data preparation and ingest, metadata generation, data protection legislation, intellectual property rights and open licensing, repository assessment frameworks, and AI functionality and ethics.
5.5. Establishment of an ethics board to develop an ethical governance framework and to oversee organisational and technical developments, including a framework for human involvement in training AI algorithms.
Exploitation Route These findings could be used for specific AI developments for Trusted Digital Repositories.
These findings could be used to further develop a National Trusted Digital Repository for Wales.
Sectors Communities and Social Services/Policy; Digital/Communication/Information Technologies (including Software); Government, Democracy and Justice; Culture, Heritage, Museums and Collections

 
Description The results have been used to inform policy makers and end-users about the potential of AI enhancement for digital repositories.
First Year Of Impact 2022
Sector Communities and Social Services/Policy; Digital/Communication/Information Technologies (including Software); Government, Democracy and Justice; Culture, Heritage, Museums and Collections
Impact Types Cultural; Societal; Policy & public services