Content Models for Enhancement and Sustainability (CMES)

Lead Research Organisation: King's College London
Department Name: Information Services and Systems

Abstract

The project will develop an extensible framework for representing and managing broad categories of complex digital collections produced by arts and humanities research, and a methodology for extending this framework. By creating open, generic (yet flexible) solutions, the project aims to maximise the sustainability of digital resources through technological change, and to facilitate the re-use of the materials in innovative ways that neither we nor the resource developers could anticipate.

The project will achieve this within the context of the Fedora digital repository software. Fedora has the concept of 'content models', which may be regarded as 'data types' for digital objects. We will exploit extensions to this concept to create 'content patterns' that serve as templates for creating, managing, sustaining, exposing and re-using digital collections that conform to the pattern.

We will develop these content models for two specific groups of collections - (i) digital texts, and (ii) multi-media performing arts collections - and we will use them to model a set of target collections within a digital repository instance that we will maintain after project completion. We will thus contribute to the enhancement and sustainability of these collections; however, we must emphasise that we are not developing ad hoc solutions for these resources, but rather content models and software that can be used for other collections of analogous type, as well as a methodology for extending our approach to other types of digital resource.

Planned Impact

Apart from the benefits gained in academic environments, the project will have broader impact in the non-academic world:
- Performing arts practitioners with an interest in researching theatre design and production (who would find the Adolphe Appia collection of great value) or in Scottish traditional performance.
- Performing arts practitioners with an interest in long-term sustainability of digital recordings and in modelling the structure(s) of performances.
- Institutions responsible for managing digital material, in particular complex material that goes beyond the 'standard' material (such as pre-prints) currently found in many repositories, for example research data and outputs from digitisation projects. These institutions will include higher education, but will potentially include many non-academic bodies, such as: libraries, archives, museums and other memory organisations; media, cultural and creative industries; commercial companies carrying out scientific research; hospitals/clinics (note that digital object types and research activities that occur in the arts and humanities may have their analogues in the sciences, e.g. annotation of images/video in medical research and practice).
- Research funders, who will benefit from increased visibility, sustainability and re-use of the research that they fund. The project will in particular facilitate 'long tail' re-use through its concern for the long-term sustainability of the digital material.
- The project will have potential for economic benefits. The 'digital economy' and 'knowledge economy' are key foundations of the UK's continued prosperity. Effective management of digital assets - effective implying sustainability and support for access and re-use in innovative ways - underlies any work in the digital economy, in whatever field (for example the cultural and creative industries, or in medical research). The development of flexible methods for digital asset management (as exemplified by this project) may thus have far-reaching consequences outside HE.
- Developers of tools for processing, linking and mashing up digital content exposed on the web. These need not be based in an academic environment - they may be in government, industry, or simply interested members of the public, and may use data from various sources. The use of content models, modular/standards-based interfaces, and in particular linked data approaches, will facilitate such mash-ups.
- This may lead in turn to social benefits among the wider community, increasing digital inclusivity. Many of these developers form part of an increasing movement to open up publicly funded data to data developers among the public who want to mash it up themselves. While in many cases this relates to government data (e.g. data.gov.uk or londondatastore), it applies to other data as well. One could imagine, for example, the Stormont Papers being mashed up with historical government statistics relating to Northern Ireland, if the information were exposed appropriately to the Web. Our approach, which is built around fine-grained access, interoperability derived from the use of common patterns, and the use of linked data technologies, will enable this sort of social computing.

Publications

 
Description The CMES Project has created an extensible framework for representing and managing broad categories of complex digital collections produced by the arts and humanities. This framework has been designed within the context of the Fedora Commons digital repository software (http://fedora-commons.org/), which is specifically designed for the modelling of complex objects and the relationships between them.

Within Fedora, representations of physical objects are formalised as 'content models', which explicitly describe the digital objects. The user interface can then make use of these content models by rendering data objects that adhere to the same content model in a consistent manner. The study found that the arts and humanities collections that were investigated could primarily be formalised within a framework consisting of five types of content model: a collection content model, to which the digital object representing the offline collection adheres; an aggregate content model, allowing for the grouping of objects below the collection level, or even across different collections; composite content models, enabling the consistent presentation of objects that consist of more than one type of resource; resource content models, stating which files each data object must contain; and resource-metadata content models, enabling the inclusion of a wider variety of metadata. These five types of content model enable the presentation of a large quantity of content to users. However, to create a repository that does not require a separate interface for every potential format type, it is necessary to convert certain files from one format to another. Because of the data loss in the conversion process, it is also necessary in many cases to provide access to the original files. For example, because there was no widely adopted 3D modelling standard within the initial collections, and because of the difficulty both of presenting the files through a web browser and of converting one 3D format to another, the primary way the files were made available was as video.
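As an illustration only, the five types of content model described above can be sketched as a small Python data structure. This is not Fedora's or Islandora's API and not the project's implementation: the class names, the file labels ("ORIGINAL", "VIDEO") and the identifiers are hypothetical, chosen to mirror the 3D example in which a video rendition is offered alongside the original file.

```python
from dataclasses import dataclass, field
from enum import Enum


class ModelKind(Enum):
    """The five types of content model named in the description."""
    COLLECTION = "collection"
    AGGREGATE = "aggregate"
    COMPOSITE = "composite"
    RESOURCE = "resource"
    RESOURCE_METADATA = "resource-metadata"


@dataclass(frozen=True)
class ContentModel:
    """A hypothetical content model: a name, a kind, and the files it requires."""
    name: str
    kind: ModelKind
    required_files: frozenset = frozenset()


@dataclass
class DigitalObject:
    """A hypothetical digital object that claims to adhere to one content model."""
    identifier: str
    model: ContentModel
    files: dict = field(default_factory=dict)    # file label -> location
    members: list = field(default_factory=list)  # child objects, if any


def missing_files(obj):
    """Return the labels the object's content model requires but the object lacks."""
    return sorted(obj.model.required_files - set(obj.files))


# Assumed example: a resource model for 3D material that requires both the
# original file and a video rendition for display in a browser.
model_3d = ContentModel(
    "example:3dModel", ModelKind.RESOURCE, frozenset({"ORIGINAL", "VIDEO"})
)
stage_set = DigitalObject("example:1", model_3d, files={"ORIGINAL": "set.obj"})
print(missing_files(stage_set))  # the video rendition has not been added yet
```

In this sketch a resource content model simply lists the files an object must contain, so a user interface could check conformance before rendering every object of that model in the same way.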

Although the framework was found to be suitable for modelling the types of data found in the initial arts and humanities collections, there were significant technical difficulties, caused by complications with Fedora and with Islandora, the front-end web-based interface for Fedora. As a complete front-end web-based solution for Fedora Commons, Islandora has a huge amount of potential. Many administrative tasks relating to user interfaces, such as supplying themes and customised display options, are greatly simplified by making use of the Drupal content management platform, while a broad range of Drupal modules can be utilised to provide further customisation with a minimal outlay of developer effort. Some of the technical difficulties could be overcome (e.g., communication between Islandora and Fedora), whilst others could not (e.g., the simple search).

The final conclusion from the research was that while the concept was sound, the software continues to need a lot of development. There is also a significant amount of work needed to convert and ingest the content, and this may not be suitable for all collections.
Exploitation Route The project has not only contributed to the academic community, but will also have a broader impact in the non-academic world, both from the collections that have been made available online and the insights that have been gained during the project. This includes both individual practitioners with an interest in the contents of the collections, and institutions interested in making a complex set of material available online.

Interested practitioners may include performing arts practitioners with an interest in researching theatre design and production, who would find the Adolphe Appia collection of great value, or someone interested in Scottish traditional performance; whilst historians and classicists may be interested in the Stormont Papers and the Inscriptions of Aphrodisias respectively. Many individuals, as well as institutions, will also have an interest in the long-term sustainability of digital recordings and in modelling the structure of performances.

An increasing number of institutions responsible for managing digital material are interested in managing complex material that goes beyond the "standard" material currently found in many repositories, such as pre-prints. Instead they are interested in managing research data and outputs from digitisation projects. These institutions will include higher education, but will potentially include many non-academic bodies, such as: libraries, archives, museums and other memory organisations; media, cultural and creative industries; commercial companies carrying out scientific research; hospitals/clinics (note that digital object types and research activities that occur in the arts and humanities may have their analogues in the sciences, e.g. annotation of images/video in medical research and practice). The CMES framework and repository provide both a method and examples of how they can achieve this.

There are also benefits to research funders, who will gain from increased visibility, sustainability and re-use of the research that they fund. The project will in particular facilitate "long tail" re-use of resources through its concern for the long-term sustainability of the digital material. This in turn provides economic benefits, as the "digital economy" and "knowledge economy" are key foundations of the UK's continued prosperity, and effective management of digital assets underlies any work in the digital economy, in whatever field. The development of flexible methods for digital asset management exemplified in this project may thus have far-reaching consequences outside higher education and academia.
Sectors Digital/Communication/Information Technologies (including Software)

URL http://cmes.cerch.kcl.ac.uk/
 
Description The findings have fed into a further project funded by the Department of Agriculture and Rural Development to look at repositories and content models for fisheries data.
First Year Of Impact 2012
Sector Agriculture, Food and Drink; Digital/Communication/Information Technologies (including Software); Culture, Heritage, Museums and Collections
Impact Types Economic; Policy & public services

 
Description Collaboration with Aegis Trust Rwanda to digitise the Gacaca Archives 
Organisation The Aegis Trust
Country United Kingdom 
PI Contribution This is an ongoing collaboration to provide advice, guidance, and to undertake research into the digitisation of the Rwandan Gacaca Archive. We have undertaken research on stakeholder requirements, metadata possibilities, and the structure and format of the archive. We have also provided training for capacity building in Rwanda.
Collaborator Contribution Our partners provide access to the Gacaca Archive and act as intermediaries with the Rwandan Government. They have sponsored a number of research visits for the project team.
Impact In progress
Start Year 2013