Historicizing the dot.com bubble and contextualizing email archives

Lead Research Organisation: University of Bristol
Department Name: Management

Abstract

Future researchers will have to engage with emails if they are to understand the lives of those who lived in the late twentieth and early twenty-first centuries. This is particularly true of organizations and their employees, for whom email has become the default form of internal and external communication. As it currently stands, publicly available email archives are rare, and there has been minimal engagement with them as a historical resource. Indeed, one of the most well-known examples, the Enron Email Corpus, only exists because of high-profile legal proceedings that followed the firm's bankruptcy and has seen minimal historical investigation since its publication. While this is partly due to its comparative recency, the reading of emails as a historical source is a developing practice and requires particular skills and knowledge that are not traditionally associated with historical enquiry. Despite this, archives and other heritage organizations are increasingly collecting and preserving email data and we are fast moving into the period where the events of the 1990s are of historical interest. We believe that our project offers a timely opportunity to address the gap between current efforts to preserve email and the future requirements that will allow them to actually be read and engaged with.

To address this issue, we seek a better understanding of how email archives can be made more accessible for the purposes of historical learning and research. The problem we focus on here is that, while emails offer valuable insight to researchers, a lack of context often presents a challenge to those wishing to understand their content, inter-relationship and wider historical significance. This de-contextualization can represent a barrier to engagement, to both trained historians and general interest users. Furthermore, existing examples of email archives often purposefully remove personal information, further disconnecting emails from their authors, recipients and connection to related material. For these reasons, our project will make an email archive available in such a way that maintains the relational and network properties that emails hold, as these allow individual emails to be understood in terms of their connection to those that precede and follow them. Furthermore, we will bring the historical context back to otherwise de-contextualized data, allowing researchers to interpret isolated items of communication in a way that appreciates the wider historical circumstances in which they were created.

We will address this challenge through a UK-US collaboration between three universities (University of Bristol, De Montfort University, University of Maryland) and two heritage sector partners (The National Archives, UK, and Hagley Museum and Library, US). Through these collaborations, the project will focus on accessioning and re-contextualizing a worked example of an email archive from a failed US software company from the dot.com era, making it available in various forms to suit the diverse requirements of its potential readers. More specifically, the project has three overall work packages that together deliver on the project's aim and objectives. The first aspect of the project centres around work linking the constituent emails in the archive together to retain the basic network structure of the communications and making relational links to otherwise disconnected emails based on their content. This will be combined with a user interface that allows the whole archive to be searched and read. The second aspect of the project provides a historical case study of the failed US company based on its archive and will require the development of both a narrative explanation of its history and an online platform for public engagement with it. The final package focuses on the project's legacy and deals with issues of long-term preservation of the archive, description of best practice, and engagement with project stakeholders.

Planned Impact

The aim of our project is to make born-digital sources more accessible to a broader audience interested in digital transformations. For museums and archives, the digital shift has meant they are faced with growing pressure from increased user expectations. Despite the ubiquity of email within modern communication, there are relatively few email archives accessible to the public. The reasons for this include access issues due to privacy concerns, difficulty in processing, preparing and preserving network-type digital data, and the inherent commercial value of such resources to technology companies for training AI-style applications. These challenges, which our project seeks to address, have different implications for our two key stakeholders: heritage professionals and the wider public.

The heritage sector - Through this project, our aim is to enhance the research capacity, knowledge and skills of public and third sector organizations. In this regard, the impact of our project will centre around the increased digital capabilities engendered by our dissemination activities and direct collaboration. Particularly, increased knowledge and skills in how to process emails for optimum researcher engagement will allow holders of such material to make them available or develop strategies for future access where the material is too recent for release. While the project focuses on an organization's email archive, these capabilities will be transferrable to other forms of email corpora, such as those of key public figures. Moreover, they will also hold relevance for newer forms of digital communications such as instant messaging, online forums, and collaborative platforms like Slack and others, which increasingly dominate the public space and organizational life. In addition to skills and resources, the archival interface we develop will have applications beyond the specific archive we apply it to, providing a resource for other heritage institutions to use in searching and reading their own collections. Similarly, the process for creating an online resource for learning will also be made publicly available, informing other interactive exhibits based on email data.

Engaging the wider public in digital history and heritage - As the dot.com boom is nearing its twentieth anniversary, we aim to use digital heritage tools to enhance cultural enrichment, presenting a history of one early software company's rise and fall. While business archives are common and maintain rich and varied material, they are not normally easily accessible to the public. Furthermore, even when collections are publicly available, they are not optimised for public engagement and can easily overwhelm potential users. In providing a basis for increased access to digital historical material in a user-appropriate form, our project will address this barrier, leveraging the rich socio-cultural detail that is embedded within email. We see this impact stemming firstly from the increased researcher-led interest in email archives, allowing email to inform the histories that are written and later engaged with by the public. More directly, the public will be able to interact with email data themselves through our digital exhibit, enriching public understanding of a historical context dominated by email communication (the dot.com boom). That is, emails hold the potential to increase public engagement with research related to societal issues, especially with a relatively recent and well-known historical phenomenon. In this, we will collaborate with our heritage partners' existing public outreach and learning programmes, as well as their established media dissemination strategies.
 
Description What were the most significant achievements from the award? To what extent were the award objectives met? If you can, briefly explain why any key objectives were not met.
- We have developed a discovery prototype for email archives (WP1), which is now available for free via GitHub.
- We have run a significant number of dissemination events online and with partners such as AURA and TNA (WP3). This has been a positive outcome of COVID: access to practitioners through these online communities appears to have been enhanced.
- We presented work at academic conferences, including roundtables at the Business History Conference, and the IEEE Big Data conference 2021 (Computational Archival Science Workshop). The IEEE contribution has also been published as a conference paper with DOI online.
- We were invited to submit a full paper discussing our project to a special issue of AI & Society, which was published open access in January 2022 (WP3).
- We are completing work on our website learning resource and hope to launch this in April (WP2).
- For an overview of the project please see our website: https://orghist.com/ahrc-project-historicizing-the-dot-com-bubble/
- We also recorded a video presentation introducing the project - see link below.
- Further recorded talks are available online, and were disseminated via social media through the OHN blog, Twitter and LinkedIn.
Exploitation Route We have built good connections with other networks and organizations of relevance to digital preservation and archives, such as AURA (https://www.aura-network.net/) and DPC (https://www.dpconline.org/) and EPADD (USA). Our discovery tool is freely available and its uses are described for a non-technical audience in the AI & Society paper, and for a more technical audience in the IEEE paper.
Sectors Digital/Communication/Information Technologies (including Software), Culture, Heritage, Museums and Collections

URL https://www.youtube.com/watch?v=sSNUrBujWSw&feature=youtu.be
 
Description
- We have developed a strong network with professionals and stakeholders in the digital heritage sector.
- We were regularly invited to speak at events and are the only speakers focused on access to and discovery of digital archives, as opposed to preservation. We are shifting the conversation by moving professional archivists' sights beyond preservation towards the ultimate future users of the collections. We are working on expanding this user-focused type of enquiry in follow-on projects.
Challenges overcome to achieve impact
- Covid required us to rethink dissemination events. This was very successful, and we believe our reach and network were improved as professionals interested in digital archiving developed strong online events and communities in which we were a major participant.
- Our international collaboration was planned around close joint work in the first six months, when our US collaborator was based in the UK on research leave and was to have more free time to progress aspects of the problem. This plan was significantly disrupted by Covid. We overcame this through regular Monday Zoom meetings, which have provided structure and continuity to our project.
First Year Of Impact 2021
Sector Digital/Communication/Information Technologies (including Software), Culture, Heritage, Museums and Collections
Impact Types Cultural

 
Description British Library blog reporting on our AURA presentation
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
URL https://blogs.bl.uk/digital-scholarship/2021/02/aura-research-network-second-workshop-write-up.html
 
Description Quoted in a cross-disciplinary article on applying AI to digital archives
Geographic Reach National 
Policy Influence Type Contribution to new or improved professional practice
URL https://eprints.whiterose.ac.uk/194106/
 
Description Email Archives: Building Capacity and Community
Amount $56,949 (USD)
Organisation Andrew W. Mellon Foundation 
Sector Private
Country United States
Start 04/2022 
End 12/2023
 
Title Online alpha-phase version of the EMCODIST research tool 
Description This website hosts a full (back-end and front-end) alpha-phase version of the research tool developed initially in Historicizing the dot.com bubble and contextualizing email archives. 
Type Of Material Improvements to research infrastructure 
Year Produced 2022 
Provided To Others? Yes  
Impact This online version of the tool has allowed us to share and demonstrate the tool to various records-management professionals, particularly with UK government. 
URL https://emcodist.com/
 
Title EMCODIST tool 
Description Test version of EMail COntextualisation DIScovery Tool (EMCODIST). It can be trained on an email collection and improves the findability of relevant email messages for qualitative research purposes. Freely available via GitHub. 
Type Of Material Computer model/algorithm 
Year Produced 2021 
Provided To Others? Yes  
Impact to be filled in later 
URL https://github.com/Contextualising-Email-Archives/discovery-tool
 
Title The Dot-Com Email Archive website 
Description Welcome to the Dot-com Archive, an interactive history of dot-com-era business. The Dot-com Archive is part of the Contextualizing Email Archives project. Drawing on a dataset of organizational emails, we present a collection of historical vignettes that provide insight into the everyday running of a dot-com start-up, AuroraTec. Each vignette deals with a different issue faced by this company, using business and management theories to better understand the events and challenges of the dot-com economy. 
Type Of Material Data analysis technique 
Year Produced 2022 
Provided To Others? Yes  
Impact This website has been used in teaching MBA students and has been presented to colleagues and through wider social media. 
URL https://dotcomarchive.bristol.ac.uk/
 
Description AOM Insight article about Academy of Management Discoveries paper on the "Morning Inbox Problem" 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact AOM Insights reports on forthcoming papers of interest in Academy of Management publications. The article, based upon an interview conducted with Profs. Kirsch and Byun, was distributed through the Academy of Management publication network.
Year(s) Of Engagement Activity 2021
URL https://journals.aom.org/doi/10.5465/amd.2018.0210.summary
 
Description AURA event January 2021 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Around 100 heritage, archives and records management professionals attended the AURA event: AI and Archives: Current Challenges and Prospects of Digital and Born-digital archives, organized jointly by The National Archives and the British Library, where we presented our project. Further information can be found here: https://www.aura-network.net/events/ai-and-archives-current-challenges-and-prospects-of-born-digital-archives-2/
Our contribution was mentioned in a British Library blog that covered the event: https://blogs.bl.uk/digital-scholarship/2021/02/aura-research-network-second-workshop-write-up.html
We have been invited to contribute to the third event run by AURA, and another event run by archivists going forward.
Year(s) Of Engagement Activity 2021
URL https://www.aura-network.net/events/ai-and-archives-current-challenges-and-prospects-of-born-digital...
 
Description Government Knowledge Information Management Profession Bite Size talks 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Stephanie Decker presented "EMCODIST: Beyond the search window" as part of the Government Knowledge Information Management Profession Bite-Size talks on 16 June 2022. This is a training activity for information management professionals in the British government.
Year(s) Of Engagement Activity 2022
 
Description Inter-disciplinary Roundtable 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The project team organized an inter-disciplinary roundtable involving researchers and archival professionals as part of the Business History Conference.
Year(s) Of Engagement Activity 2021
URL https://thebhc.org/2021-bhc-virtual-meeting
 
Description LUSTRE workshop 1: AI and born-digital archives: Challenges and opportunities 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Adam Nix & Stephanie Decker were invited to talk about "Finding Light in Dark Archives" as part of the research council-funded network "LUSTRE - Unlocking our Digital Past with Artificial Intelligence". The network hosted its first workshop, "AI and born-digital archives: Challenges and opportunities", together with the Cabinet Office, London, on 26 January 2023. This was a hybrid event with several hundred online attendees and about 50 in-person attendees.
Year(s) Of Engagement Activity 2023
URL https://lustre-network.net/2023/01/11/workshop-1-ai-and-born-digital-archives-challenges-and-opportu...
 
Description New Search Tool Options for Discovery in Email Archives 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Future researchers will have to engage with emails if they are to understand the lives of those who lived in the late twentieth and early twenty-first centuries. Archives and other heritage organizations are increasingly collecting and preserving email and we are fast moving into the period where the events of the 1990s are of historical interest. While email can offer valuable insight to researchers, a lack of context in any individual message in an email thread often presents a challenge to those wishing to understand their content, inter-relationship and wider historical significance. This de-contextualization can represent a barrier to engagement, to both trained historians and general interest users, and makes standard keyword search tools less useful. As part of this AHRC-funded project, TNA and the University of Bristol are collaborating to develop a contextual email discovery tool, currently at the prototype stage, that is tailored to the unique nature of email as a form of communication.
The seminar was focussed on how we conceptualised email and the challenges of developing more contextual discovery. A demo of the first iteration of this tool (which is very much at a prototype stage) was presented. This tool development is a part of the wider research project.
Year(s) Of Engagement Activity 2021
 
Description Presentation at AURA event 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The purpose of this workshop was to "bring together key actors in the archive 'circuit': from creators of data, to archivists and to users (thereby crossing the boundaries between Computer Scientists and Humanities Scholars) with the aim of planning new projects on AI and Archives. The workshop focused on the ethics of AI use in archives, looked at AI techniques such as machine learning, and reflected specifically on issues in digital humanities". Dr Adam Nix presented a talk on "Finding light in dark archives: Using AI to connect context and content in email", which provided an update on progress in relation to this project and discussed its broader implications.
Year(s) Of Engagement Activity 2021
URL https://www.aura-network.net/events/artificial-intelligence-and-archives-what-comes-next/
 
Description Presentation at the Digital Archives Learning Exchange 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The National Archives (UK) Digital Archives Learning Exchange (DALE) online event "Strictly on the Download: Digital Presentation in Action" took place on YouTube on 16 March 2021. Stephanie Decker presented insights from the AHRC-funded "Contextualising Email Archives" project.
Year(s) Of Engagement Activity 2021
URL https://www.youtube.com/watch?v=LHD9mAkzx3M
 
Description YouTube video summarising the project 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact As part of our broader dissemination strategy, we created a YouTube summary of our project, aims, and current progress. This is shared along with other information about the project on social media and other channels.
Year(s) Of Engagement Activity 2020
URL https://youtu.be/sSNUrBujWSw