Mixed-methods Digital Oral History: Enfolding semantic web technologies and historical-interpretative analysis
Lead Research Organisation:
UNIVERSITY COLLEGE LONDON
Department Name: Information Studies
Abstract
Long before social media and the internet, oral history used representational technologies, like the tape recorder, to capture and amplify voices that would otherwise have gone unheard. Yet, despite oral history's long use of technology, its engagement with the digital turn in the humanities has thus far been limited, focussing more on recording and accessibility than on analysis and new research possibilities. Oral historians are predominantly using the digital to publish and disseminate interviews or to interrogate individual interviews according to timestamp, project-specific annotation and interview-specific keywords. The field has engaged little with ongoing research in other fields of the Humanities, such as History, which is showing the potential of Semantic Web technologies to open new research horizons. Based on formal languages such as RDF, RDFS or OWL, these technologies can describe the meaning of data and the connections among them, in order to define concepts, persons, places and any other kind of entity, and to facilitate multifaceted retrieval, reasoning, optimal data integration and knowledge reuse. It is this innovative and far-reaching research potential that this project seeks to unlock. This project will bring together experienced researchers in the UK and Germany to ask how semantic web technologies, historical-interpretative analysis and digital research methods can be mobilised and interfolded to foment a 'digital methodological-hermeneutical turn' in oral history research. Positioning the emerging sub-field of the history of Digital Humanities as an exemplary case study for this research, this project will seek to understand the impacts that digital technology is making on the production, organisation and 'peoplescapes' of Humanities knowledge (and vice versa).
In doing so, it will develop and make freely available an interoperable infrastructure of interconnected entities, in the form of a Knowledge Graph, to promote shared understanding, information representation, interrogation and discovery whilst ensuring data consistency, reusability and accessibility. Moreover, this project will generate new historical knowledge and data about the processes of formation, disruption and change that have underpinned the take-up of technology in the wider Humanities, leading to the field now known as Digital Humanities. Using FAIR (Findable, Accessible, Interoperable, Reusable) principles, we will advocate data sharing and reuse, ensuring the transparency and reliability of our project. Combining oral history as a source and process, semantic web technologies and digital methods, this project will create new knowledge, digital artefacts and hermeneutic critical reflections that have relevance right across the fields of oral history and the history of knowledge, the history of the humanities and science, information studies and computer science, including semantic technologies.
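As a rough illustration of how a Knowledge Graph can describe entities and their connections and support multifaceted retrieval, the sketch below stores statements as subject-predicate-object triples (the basic shape of RDF data) and answers a simple pattern query. It is not the project's implementation: every identifier (the `ex:` names, the interviews, persons and place) is hypothetical, and a real system would more likely use an RDF store queried with SPARQL.

```python
# Minimal in-memory triple store. All identifiers are hypothetical
# illustrations, not terms from the project's ontology.
from typing import Optional

Triple = tuple[str, str, str]

# Each statement is a (subject, predicate, object) triple, as in RDF.
TRIPLES: set[Triple] = {
    ("ex:interview1", "rdf:type", "ex:Interview"),
    ("ex:interview1", "ex:hasInterviewee", "ex:personA"),
    ("ex:interview1", "ex:mentionsPlace", "ex:London"),
    ("ex:interview2", "rdf:type", "ex:Interview"),
    ("ex:interview2", "ex:hasInterviewee", "ex:personB"),
    ("ex:interview2", "ex:mentionsPlace", "ex:London"),
    ("ex:personA", "rdf:type", "ex:Person"),
    ("ex:personB", "rdf:type", "ex:Person"),
}


def match(
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> list[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def interviewees_mentioning(place: str) -> list[str]:
    """Join two patterns: interviews mentioning a place, then their interviewees."""
    interviews = [s for s, _, _ in match(p="ex:mentionsPlace", o=place)]
    return sorted(
        o for i in interviews for _, _, o in match(s=i, p="ex:hasInterviewee")
    )


if __name__ == "__main__":
    print(interviewees_mentioning("ex:London"))
```

Because every statement has the same three-part shape, the same `match` function can retrieve by person, place or interview, which is the kind of cross-interview interrogation that timestamp- and keyword-based access does not readily support.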
| Title | MeDoraH Ontology (alpha version) |
| Description | This ontology provides a comprehensive computational model for representing and analysing oral history interviews. It includes: a formal specification of concepts and relationships in the oral history domain; a structured vocabulary for describing interview content, metadata and temporal aspects; semantic relationships between different elements of oral history narratives; and machine-readable definitions enabling automated reasoning and query capabilities. The model is implemented using OWL (Web Ontology Language) standards and includes complete documentation and usage guidelines. It specifically addresses the challenges of capturing narrative elements, temporal relationships and contextual information in oral history interviews. |
| Type Of Material | Computer model/algorithm |
| Year Produced | 2025 |
| Provided To Others? | Yes |
| Impact | The ontology is the first of its kind to address issues of hermeneutics in oral history. Although still in development, this first iteration of the ontology has enabled a multidisciplinary team consisting of historians, digital humanists, computer scientists and information professionals to share a common understanding of the domain of enquiry. The ontology has contributed to advancing oral history research methodology by: providing a standardized framework for representing oral history data; enabling interoperability between different oral history collections; supporting sophisticated querying and analysis of interview content; facilitating cross-linguistic analysis of oral history narratives; and creating a foundation for applying semantic web technologies to oral history research. |
| URL | https://github.com/articoder/MeDoraH_Ontology |
| Title | MeDoraH Project GitHub Repository: Oral History Analysis Tools and Datasets |
| Description | A comprehensive GitHub repository containing: datasets derived from oral history interviews; computational models for analysing oral history content; software tools and programs for processing oral history data; documentation and methodological guidelines; and code and scripts for data processing and analysis. The repository follows FAIR data principles and includes detailed documentation to ensure reproducibility and reuse. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | The repository has contributed to: enhanced reproducibility of research methods in digital oral history; knowledge sharing within the digital humanities community; standardisation of computational approaches to oral history analysis; facilitation of collaborative research through open-source tools; creation of reusable resources for future oral history projects; and establishment of best practices for digital oral history research. |
| URL | https://github.com/medorah/MeDoraH-Project |
| Title | 1st Iteration - Digital Oral History Interview Archive Platform using Omeka-S |
| Description | A web-based platform built using Omeka-S for showcasing and managing oral history interviews. The platform provides: a user-friendly interface for accessing oral history interviews; structured metadata management for oral history materials; search and browse functionalities; digital preservation capabilities; and integration with standard metadata schemas. |
| Type Of Technology | Webtool/Application |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | The platform has enhanced the accessibility and discoverability of oral history interviews by: providing standardised access to oral history materials; enabling researchers to easily search and browse interview content; ensuring long-term preservation of valuable oral history materials; supporting interoperability through standardised metadata; and facilitating knowledge sharing within the digital humanities community. |
| URL | https://medorah.hiddenhistories.net/ |
| Description | Voices unbound: AI and Oral History: Applications in Holocaust Testimonies |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A lecture series co-organised by TU Darmstadt, University College London, Luxembourg Centre for Contemporary and Digital History (C²DH) and the Max-Planck-Institut für Wissenschaftsgeschichte. The seminar offers an important way of keeping up to date with the methodological and theoretical state of the art in digital oral history. We invited speakers to present work on recent technological developments that may hold promise for digital oral history. In this way, the seminar series appeals to (digital) oral historians, digital humanists and scholars of the history of information, memory and knowledge systems. The talk focused on the application of Automatic Speech Recognition (ASR) technology, specifically OpenAI's Whisper model, to transcribe oral testimonies from the Holocaust. We will discuss what makes ASR for oral Holocaust testimonies challenging and present examples of successes and failures. The lecture will also cover post-processing techniques for automatically generated transcripts, including Named Entity Recognition (NER). Participants will gain insights into how AI tools can support oral history research and why domain expertise is key to correcting and interpreting the results of these tools. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://www.c2dh.uni.lu/events/voices-unbound-lecture-series-digital-oral-history |
| Description | Voices unbound: Applying Digital Linguistics Techniques to Oral History Collections |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A lecture series co-organised by TU Darmstadt, University College London, Luxembourg Centre for Contemporary and Digital History (C²DH) and the Max-Planck-Institut für Wissenschaftsgeschichte. The seminar offers an important way of keeping up to date with the methodological and theoretical state of the art in digital oral history. We invited speakers to present work on recent technological developments that may hold promise for digital oral history. In this way, the seminar series appeals to (digital) oral historians, digital humanists and scholars of the history of information, memory and knowledge systems. The talk highlighted the interdisciplinary potential between oral history and (corpus) linguistics. I will demonstrate how oral historians can gain insights from the application of various linguistics tools and competencies and how linguists can benefit from using oral history texts in their research as they present rich examples of authentic spoken language. Examples will be provided of studies that have bridged this interdisciplinary divide and projections will be presented as to how this synthesis may progress. Attendees will be exposed to various contemporary tools used in corpus linguistics as well as accessible and relevant oral history archives that they may wish to exploit for further research. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://www.c2dh.uni.lu/events/voices-unbound-lecture-series-digital-oral-history |
| Description | Voices unbound: Oral History and the digital transformation |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A lecture series co-organised by TU Darmstadt, University College London, Luxembourg Centre for Contemporary and Digital History (C²DH) and the Max-Planck-Institut für Wissenschaftsgeschichte. The seminar offers an important way of keeping up to date with the methodological and theoretical state of the art in digital oral history. We invited speakers to present work on recent technological developments that may hold promise for digital oral history. In this way, the seminar series appeals to (digital) oral historians, digital humanists and scholars of the history of information, memory and knowledge systems. Dr Linde Apel gave a talk on the development of oral history and how it has always been strongly influenced by technological developments. As a result, oral historians have often been technologically savvy and open to change. In my talk, I will discuss how approaches to interviews and demands on their accessibility are changing as a result of the increasing and increasingly taken-for-granted digitization of (historical) scholarship and research processes. It is notable that, at least in the German-speaking academic community, there is a growing interest in oral history interviews among those who do not specialize in this particular field. This is largely because such interviews can now be located and utilized far more effectively in digital format. What are the implications of this for the production of oral history sources, and for the lengthy and costly process of conducting interviews? What role do the long-standing and painstakingly compiled oral history archives play in this? |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://www.c2dh.uni.lu/events/voices-unbound-lecture-series-digital-oral-history |
| Description | Voices unbound: Anthracite Oral Histories and Text Mining |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A lecture series co-organised by TU Darmstadt, University College London, Luxembourg Centre for Contemporary and Digital History (C²DH) and the Max-Planck-Institut für Wissenschaftsgeschichte. The seminar offers an important way of keeping up to date with the methodological and theoretical state of the art in digital oral history. We invited speakers to present work on recent technological developments that may hold promise for digital oral history. In this way, the seminar series appeals to (digital) oral historians, digital humanists and scholars of the history of information, memory and knowledge systems. The seminar covered novel approaches to analyzing existing public oral history collections in the context of the anthracite coal region of Northeastern Pennsylvania, USA. The oral history collections from this region include memories of coal miners and their families related to coal mining, immigration, and livelihoods that remain important for understanding local history and identity. Here, we re-examine oral history transcripts using text mining approaches with R. We consider the methodological challenges and advantages of applying techniques such as sentiment tagging and domain-specific lexicons to historic texts. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://www.c2dh.uni.lu/events/voices-unbound-lecture-series-digital-oral-history |
| Description | Voices unbound: Breathing Emotions & Stories A different way of navigating Oral History |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A lecture series co-organised by TU Darmstadt, University College London, Luxembourg Centre for Contemporary and Digital History (C²DH) and the Max-Planck-Institut für Wissenschaftsgeschichte. The seminar offers an important way of keeping up to date with the methodological and theoretical state of the art in digital oral history. We invited speakers to present work on recent technological developments that may hold promise for digital oral history. In this way, the seminar series appeals to (digital) oral historians, digital humanists and scholars of the history of information, memory and knowledge systems. The talk focused on how the history of the oral history tradition records the influence of technology in collecting, archiving and navigating oral history collections. Recent years have witnessed how computational technologies have further transformed the field of archival sciences; in particular, tools like automatic speech recognition and natural language processing offer distinct opportunities for transcribing and analyzing oral history (OH) interviews. However, many oral historians emphasize the loss of auditory information when speech is converted to text, highlighting the importance of subjective cues for a complete understanding of the interviewee's narrative. In this talk, we will bring in questions, challenges and ways of navigating interview collections that focus on the personal and emotional, by looking at the potential of paralinguistic cues such as breathing, fillers and pauses. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://www.c2dh.uni.lu/events/voices-unbound-lecture-series-digital-oral-history |
| Description | Voices unbound: Curating large oral history archives with artificial intelligence |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A lecture series co-organised by TU Darmstadt, University College London, Luxembourg Centre for Contemporary and Digital History (C²DH) and the Max-Planck-Institut für Wissenschaftsgeschichte. The seminar offers an important way of keeping up to date with the methodological and theoretical state of the art in digital oral history. We invited speakers to present work on recent technological developments that may hold promise for digital oral history. In this way, the seminar series appeals to (digital) oral historians, digital humanists and scholars of the history of information, memory and knowledge systems. Chris Pandza gave a talk on how artificial intelligence (AI) presents new opportunities for assisted oral history curation, but also new pitfalls. What could the role(s) of AI be in the process of curation? Does AI necessarily reinforce its own biases on a collection, or can it help curators measure and mitigate their own biases? How does AI interact with the ethical and epistemological frameworks that shape oral history? Using real-world examples from the Ellis Island Oral History (National Park Service), Obama Presidency Oral History (Incite at Columbia University), and The Elders Project (Incite), I will demonstrate a number of innovative AI techniques that practitioners can use to prepare oral history collections for public use. In addition, I will demonstrate how AI can be used to make a project's curation process more rigorous and inclusive. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://www.c2dh.uni.lu/events/voices-unbound-lecture-series-digital-oral-history |
| Description | Voices unbound: Ethical Considerations for the Digital Reuse of Oral Histories |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A lecture series co-organised by TU Darmstadt, University College London, Luxembourg Centre for Contemporary and Digital History (C²DH) and the Max-Planck-Institut für Wissenschaftsgeschichte. The seminar offers an important way of keeping up to date with the methodological and theoretical state of the art in digital oral history. We invited speakers to present work on recent technological developments that may hold promise for digital oral history. In this way, the seminar series appeals to (digital) oral historians, digital humanists and scholars of the history of information, memory and knowledge systems. Mary Larson gave a talk on how oral historians have spent a great deal of effort trying to capture and present fuller contexts for oral histories, with the hope that the additional information would help researchers as they make meaning from the interviews. As oral histories intersect more frequently with digital analytics, whether through the parsing of texts, incorporation into large language models, or reuse in training AI applications, we need to be aware of how those approaches might impact how we are able to understand the oral histories in question. The stripping of carefully curated context from interviews can have ethical repercussions for how oral histories are used and reused, and this talk addresses considerations of context and its addition to or subtraction from the historical record. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://www.c2dh.uni.lu/events/voices-unbound-lecture-series-digital-oral-history |
| Description | Voices unbound: Lessons learnt from building a digital infrastructure for oral history |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A lecture series co-organised by TU Darmstadt, University College London, Luxembourg Centre for Contemporary and Digital History (C²DH) and the Max-Planck-Institut für Wissenschaftsgeschichte. The seminar offers an important way of keeping up to date with the methodological and theoretical state of the art in digital oral history. We invited speakers to present work on recent technological developments that may hold promise for digital oral history. In this way, the seminar series appeals to (digital) oral historians, digital humanists and scholars of the history of information, memory and knowledge systems. Oral historians are increasingly recognizing the benefits of making their interviews available as datasets and ensuring they are 'FAIR' (Findable, Accessible, Interoperable, Reusable). This practice encourages reuse by researchers, museums, community archives, and others. However, digitization also brings several key methodological debates in oral history into focus, including the necessity of transcripts, the intersubjective nature of oral history, and ethical concerns about publishing. In addition, digitization has blurred notions of 'research data' and 'heritage data', and of 'academic' and 'non-academic' (or community) oral history. In this talk, I will explore the CLARIAH Media Suite and assess how various methodological issues in oral history have complicated its use, including informed consent, transcription, thesauri and metadata standards. The CLARIAH Media Suite is designed to provide researchers with access to (cultural heritage) data in the Netherlands. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://www.c2dh.uni.lu/events/voices-unbound-lecture-series-digital-oral-history |
| Description | Voices unbound: Machine Learning to Analyze Interview Qs in Holocaust Oral Histories |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A lecture series co-organised by TU Darmstadt, University College London, Luxembourg Centre for Contemporary and Digital History (C²DH) and the Max-Planck-Institut für Wissenschaftsgeschichte. The seminar offers an important way of keeping up to date with the methodological and theoretical state of the art in digital oral history. We invited speakers to present work on recent technological developments that may hold promise for digital oral history. In this way, the seminar series appeals to (digital) oral historians, digital humanists and scholars of the history of information, memory and knowledge systems. Dr Presner gave an overview of his new book, 'Ethics of the Algorithm: Digital Humanities and Holocaust Memory' (Princeton University Press, 2024), which uses a wide range of computational methods to read and listen to Holocaust testimonies. He will then focus on one particular project: using language transformers to study clusters of topics in nearly 90,000 interviewer questions (across four oral history corpora). He will conclude by discussing the recent research of his "AI and Cultural Heritage" lab, showing how Large Language Models can help disambiguate unclear references and provide additional context for testimony analysis. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://www.c2dh.uni.lu/events/voices-unbound-lecture-series-digital-oral-history |
| Description | Voices unbound: Usability, accessibility and interoperability of OH interviews |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A lecture series co-organised by TU Darmstadt, University College London, Luxembourg Centre for Contemporary and Digital History (C²DH) and the Max-Planck-Institut für Wissenschaftsgeschichte. The seminar offers an important way of keeping up to date with the methodological and theoretical state of the art in digital oral history. We invited speakers to present work on recent technological developments that may hold promise for digital oral history. In this way, the seminar series appeals to (digital) oral historians, digital humanists and scholars of the history of information, memory and knowledge systems. The talk was about OH metadata and their role in making stories widely available, accessible and interoperable. Traditional descriptive methods should find their way towards a more meaningful context thanks to semantic web technologies. Worldwide there have already been some interesting initiatives moving beyond traditional cataloguing and description, but how could these communicate with each other? Could ontologies play their part in making OH interviews meaningful, accessible and useful around the world? We will discuss some examples and ontological models, as well as the role of IT and information professionals in shaping a viable future for OH collections. |
| Year(s) Of Engagement Activity | 2024 |
| Description | Voices unbound: What are large language models doing? |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A lecture series co-organised by TU Darmstadt, University College London, Luxembourg Centre for Contemporary and Digital History (C²DH) and the Max-Planck-Institut für Wissenschaftsgeschichte. The seminar offers an important way of keeping up to date with the methodological and theoretical state of the art in digital oral history. We invited speakers to present work on recent technological developments that may hold promise for digital oral history. In this way, the seminar series appeals to (digital) oral historians, digital humanists and scholars of the history of information, memory and knowledge systems. The talk was about large language models, arguing that their outputs are best characterized as bullshit in the sense of Frankfurt (1985). I'll discuss some recent developments in generative AI, and argue that even in cases in which generative systems are not, strictly speaking, bullshitting, there is a mismatch between their creations and the expressive role of similar outputs by humans. I'll conclude with some discussion of anthropomorphising metaphors in science and technology communication. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://www.c2dh.uni.lu/events/voices-unbound-lecture-series-digital-oral-history |
