The Bentham Papers Transcription Initiative

Lead Research Organisation: UNIVERSITY COLLEGE LONDON

Department Name: Bentham Project

Abstract

The Bentham Papers deposited in UCL Library consist of 60,000 folios. This material has never been properly edited, most of it has never been published, and two-thirds of it remains un-transcribed. Much of its content is, therefore, unknown. The Bentham Project was established in 1959 in order to produce an authoritative edition of 'The Collected Works of Jeremy Bentham', and has to date published twenty-six volumes of a projected sixty-eight. In 2006 the Project completed an on-line database catalogue of the Bentham Papers, consisting of up to sixteen fields of information for each of the folios, including headings, dates, pagination, and titles. The database is currently used as an editing tool by the Bentham Project, allowing it, for instance, to identify all the manuscripts relevant to a particular work in a moment, rather than after several weeks of manual searching.

Part of the vision at the time of creating the database was to enhance it by linking it to transcripts and digital images of the manuscripts. The Bentham Project has transcribed around 20,000 folios, but 40,000 remain to be done. Transcription is the first and fundamental stage in the editing of Bentham's works. A pilot project has established that an interface can be created linking transcripts, digital images, and the database catalogue, and that the digital images are of such quality that they can be used for the purposes of transcription. The object of the present project is to develop a web-site which will integrate these various elements in a coherent way, and which will also implement a 'crowd-sourcing' exercise, whereby members of the public will be invited to submit their own transcripts of previously unread manuscripts.

The transcription project will have a limited duration of six months. We will provide around 12,500 images of Bentham's manuscripts, amounting to around 10,000 folios. Individuals will be able to download a transcription tool, take temporary ownership of images of manuscripts, and enter the text into a transcription window. They will be asked to follow a series of basic rules. Transcripts will be submitted to the Bentham Project for moderation and, once approved, made available on-line. Transcribers will be awarded a merit mark for each successful submission, both as a virtual reward and as a means of identifying the truly dedicated, who might then be involved in some more formal way with the resource. We will establish a users' forum and a means of undertaking joint transcription, so that two or more people can co-operate in producing transcripts.
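
The submission workflow described above (temporary ownership of an image, submission, moderation, merit mark) can be sketched in Python as a small state machine. This is only an illustration under stated assumptions: the class and method names, the length of a claim, and the decision to return a rejected folio to the pool are all hypothetical and are not specified in the proposal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Dict, Optional


class Status(Enum):
    AVAILABLE = "available"
    CLAIMED = "claimed"
    SUBMITTED = "submitted"
    APPROVED = "approved"


@dataclass
class Folio:
    image_id: str  # hypothetical identifier for a manuscript image
    status: Status = Status.AVAILABLE
    claimed_by: Optional[str] = None
    claim_expires: Optional[datetime] = None
    text: str = ""


class TranscriptionDesk:
    """Hypothetical moderation desk; not the project's actual software."""

    def __init__(self, claim_duration: timedelta) -> None:
        self.claim_duration = claim_duration  # assumed length of ownership
        self.merit_marks: Dict[str, int] = {}

    def claim(self, folio: Folio, user: str, now: datetime) -> None:
        # A folio may be claimed if it is free or if an earlier claim lapsed.
        lapsed = folio.status is Status.CLAIMED and now >= folio.claim_expires
        if folio.status is not Status.AVAILABLE and not lapsed:
            raise ValueError("folio is not available")
        folio.status = Status.CLAIMED
        folio.claimed_by = user
        folio.claim_expires = now + self.claim_duration

    def submit(self, folio: Folio, user: str, text: str, now: datetime) -> None:
        # Only the current owner may submit, and only while the claim is live.
        if (folio.status is not Status.CLAIMED
                or folio.claimed_by != user
                or now >= folio.claim_expires):
            raise ValueError("no live claim held by this user")
        folio.text = text
        folio.status = Status.SUBMITTED

    def moderate(self, folio: Folio, approved: bool) -> None:
        if folio.status is not Status.SUBMITTED:
            raise ValueError("nothing to moderate")
        if approved:
            folio.status = Status.APPROVED
            user = folio.claimed_by
            self.merit_marks[user] = self.merit_marks.get(user, 0) + 1
        else:
            # Assumption: a rejected transcript returns the folio to the pool.
            folio.status = Status.AVAILABLE
            folio.claimed_by = None
            folio.claim_expires = None
```

In this sketch a transcriber's tally in `merit_marks` rises by one for each approved submission, which is one simple way of identifying the most active contributors.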

The transcriptions will eventually be used by the professional researchers at the Bentham Project when preparing the material for the new critical edition. They will also form part of an 'ideas bank'. Bentham wrote on a wide variety of subjects. His ideas were of enormous historical importance, but are also of great contemporary relevance. The transcripts will be readily searchable, so that researchers, whether academics or members of the general public, who are interested in a particular subject, can discover what Bentham thought about that subject. The Bentham Project's existing transcripts, which are currently in a proprietary format, and the newly submitted transcripts, will be encoded according to the protocol of the Text Encoding Initiative, thus guaranteeing the sustainability and transferability of the resource.
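
To give a rough idea of what encoding a transcript according to the Text Encoding Initiative might involve, the following Python sketch builds a small TEI-style fragment with the standard library. The elements used (`body`, `div`, `head`, `p`, `del`, `add`) are TEI elements for structure, headings, paragraphs, deletions, and additions; the function name and the sample wording are invented for illustration and are not taken from any manuscript or from the project's own encoding scheme, which is not described here.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialise TEI as the default namespace


def tei(tag: str) -> str:
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_fragment(heading: str, before: str, deleted: str,
                   added: str, after: str) -> ET.Element:
    """Encode one paragraph containing a crossing-out and an addition."""
    body = ET.Element(tei("body"))
    div = ET.SubElement(body, tei("div"))
    head = ET.SubElement(div, tei("head"))
    head.text = heading
    para = ET.SubElement(div, tei("p"))
    para.text = before + " "
    struck = ET.SubElement(para, tei("del"))  # text crossed out by the writer
    struck.text = deleted
    struck.tail = " "
    inserted = ET.SubElement(para, tei("add"))  # text added by the writer
    inserted.text = added
    inserted.tail = " " + after
    return body


# Invented sample wording, used only to exercise the function.
fragment = build_fragment(
    heading="Sample heading",
    before="The first",
    deleted="draft",
    added="revised",
    after="wording of the sentence.",
)
xml_text = ET.tostring(fragment, encoding="unicode")
```

Because crossings-out and additions are marked up rather than silently resolved, the same file can in principle serve both a reader who wants the final text and an editor who needs to see the revisions.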

At the end of the project, a study will be produced, using both quantitative and qualitative data, on the way in which the database was used, and on the lessons to be drawn from it. A generic transcription tool will be made available for other humanities digital research projects to incorporate into their own web-sites.

Planned Impact

One of the key elements of this proposal is the interactive transcription tool. Its purpose is to engage the general public in the process of transcribing manuscripts, and hence to stimulate greater interest in the life and thought of Jeremy Bentham. There is already significant passive interest in Bentham: many people have heard about Bentham's auto-icon (it was mentioned, for instance, in a 'Guardian' editorial, 'In praise of University College London', on 10 October 2009, and features as a main attraction in the Ripley's Believe It Or Not Museum at Piccadilly Circus), and many are aware of the panopticon prison scheme. Bentham is taught on Religious Studies, Philosophy, and History courses in sixth-forms throughout the UK (we occasionally give lectures to school groups and we recently hosted two sixth-form work placement students). Our experience with visitors to the Bentham Project is that they are fascinated by Bentham's manuscripts, and are often keen to 'have a go' at reading them. We have received strong endorsement from a School's Extended Services Manager, responsible for 27 schools in Bedfordshire, and from a humanities teacher in a sixth-form college in Wigan, Greater Manchester, who believe the resource will be attractive to sixth-form teachers and students alike, as well as to Gifted and Talented children lower down the school.

We do, therefore, have a strong expectation that there will be a large pool of people who will be interested in transcribing Bentham manuscripts. The transcription tool will be kept straightforward, the instructions will be clear, the digital images will be of high quality, and all will be presented in an attractive and intuitive interface. We will set up a users' forum, and permit joint transcription. Transcripts will be submitted to the Bentham Project, where they will be moderated, and the transcriber will receive a merit mark for each acceptable transcript (using reasonably generous criteria as to what is acceptable).

We intend that the resource should constitute an 'ideas bank', and should be used by a wide variety of persons who are interested in a wide variety of subject areas. Ideas do not suddenly appear in the mind from nowhere. One of our main sources of ideas is the great thinkers of the past, themselves drawing on traditions of thought, and sometimes inventing new ones of their own. By drawing on the wealth of ideas which the great thinkers of the past have bequeathed us, we can clarify our own thoughts, and often be pointed towards issues which had not even occurred to us. Bentham is particularly valuable in this respect, since his influence on our society has been profound, and he still has much to teach. We have a massive archive of his papers which has only been partially explored, and much of what has been explored remains in relatively inaccessible transcripts in the Bentham Project archive. Hence a key element of this proposal is to render the transcripts of the Bentham Papers searchable in a way that will be helpful to those who are in search of ideas.

The main target groups for the 'ideas bank' are policy makers (for instance think tanks) and the media. We are often asked, 'What would Bentham have said about ....?' The best response is, 'Well, here is what he did say about it.' Hence, when issues surrounding surveillance are debated, policy makers, for instance, might look to see what Bentham said about the panopticon prison; when issues surrounding openness in government, or corruption in public life, are being debated, they might look to Bentham's writings on representative democracy; and when issues concerning legal reform are being debated, they might look to Bentham's writings on codification.

The two elements of impact described here are intimately linked, in that the greater the amount of transcription undertaken, the more in

Funded Value:

£262,673

Funded Period:

Mar 10 - Apr 11

Funder:

AHRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

AH/H037233/1

Principal Investigator:

Thomas Schofield

Research Subject:

Law & legal studies (33%)

Philosophy (66%)

Research Topic:

History Of Ideas (33%)

History Of Philosophy (33%)

Jurisprudence/Legal Philosophy (33%)

Organisations

People	ORCID iD
Thomas Schofield (Principal Investigator)

Publications

Causer T (2014) Crowdsourcing Bentham: Beyond the Traditional Boundaries of Academic History in International Journal of Humanities and Arts Computing

Causer T (2012) Building a Volunteer Community: Results and Findings from 'Transcribe Bentham' in Digital Humanities Quarterly

Causer T (2018) 'Making such bargain': Transcribe Bentham and the quality and cost-effectiveness of crowdsourced transcription in Digital Scholarship in the Humanities

Causer T (2014) Crowdsourcing Our Cultural Heritage

Causer T (2012) Transcription maximized; expense minimized? Crowdsourcing and editing The Collected Works of Jeremy Bentham in Literary and Linguistic Computing

Gatos B (2014) Ground-Truth Production in the Transcriptorium Project

Moyle M (2011) Manuscript Transcription by Crowdsourcing: Transcribe Bentham in LIBER Quarterly: The Journal of the Association of European Research Libraries

Prats Lopez M (2015) Extra-Organizational Learning: Learning Beyond Organizational Boundaries

Schofield P (2015) Jeremy Bentham and the computer age: reflections on crowdsourcing the transcription of handwritten documents in Annual Bulletin of Resources and Historical Collections Office (shiryo-shitsu), The Library of Economics, The University of Tokyo

Tieberghien E (2016) Mapping the Bentham Corpus

Artistic and Creative Products
Further Funding
Collaboration
Engagement Activities


Title	A film by UCL Media Services about Transcribe Bentham, & how it fits into the editorial work of the Bentham project.
Type Of Art	Film/Video/Animation


Description	(READ) - Recognition and Enrichment of Archival Documents
Amount	€ 8,220,716 (EUR)
Funding ID	674943
Organisation	European Commission
Sector	Public
Country	Belgium
Start	01/2016
End	06/2019


Description	(tranScriptorium) - tranScriptorium
Amount	€ 3,005,570 (EUR)
Funding ID	600707
Organisation	European Commission
Sector	Public
Country	Belgium
Start	01/2013
End	12/2015


Description	A Collaborative Project Between Faculty and Students at the University of Toronto and UCL Using Handwritten Text Recognition Technology and Topic Modelling
Amount	£16,668 (GBP)
Organisation	University College London
Sector	Academic/University
Country	United Kingdom
Start	07/2019
End	03/2021


Description	The Consolidated Bentham Papers Repository
Amount	£339,000 (GBP)
Organisation	Andrew W. Mellon Foundation
Sector	Private
Country	United States
Start	09/2012
End	09/2014


Description	Retrieval and Enrichment of Archival Documents (READ)
Organisation	Democritus University of Thrace
Country	Greece
Sector	Academic/University
PI Contribution	The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems.
• The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else.
• Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, roughly 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex.
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers of Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition.
• The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use.
• The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription, a proof of concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution	The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016 to 2019, was a pan-European consortium consisting of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact	Transkribus https://readcoop.eu/transkribus/
Start Year	2016
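
The Character Error Rate quoted in the PI Contribution above is conventionally computed as the edit distance between the model's output and the ground-truth transcript, divided by the length of the ground truth. The Python sketch below shows that calculation under this standard definition; it is not the READ team's evaluation code, and the function names and sample strings are invented for illustration.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # character missing from hypothesis
                current[j - 1] + 1,      # extra character in hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the ground-truth text."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Invented example: one character dropped from a 22-character reference.
ground_truth = "the greatest happiness"
model_output = "the greatest hapiness"
cer = character_error_rate(ground_truth, model_output)
```

Because insertions count as errors, the CER can exceed 100% on very poor output, and "82% of characters correctly recognised" is best read as an approximation of a CER of 18% rather than an exact equivalent.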


Description	Retrieval and Enrichment of Archival Documents (READ)
Organisation	Direction de la Sécurité et de la Justice
Country	Switzerland
Sector	Public
Impact	Transkribus https://readcoop.eu/transkribus/
Start Year	2016


Description	Retrieval and Enrichment of Archival Documents (READ)
Organisation	NAVER LABS Europe
Country	France
Sector	Public
Impact	Transkribus https://readcoop.eu/transkribus/
Start Year	2016


Description	Retrieval and Enrichment of Archival Documents (READ)
Organisation	National Centre for Scientific Research (NCSR) Demokritos
Country	Greece
Sector	Academic/University
Impact	Transkribus https://readcoop.eu/transkribus/
Start Year	2016


Description	Retrieval and Enrichment of Archival Documents (READ)
Organisation	Polytechnic University of Valencia
Country	Spain
Sector	Academic/University
Impact	Transkribus https://readcoop.eu/transkribus/
Start Year	2016


Description	Retrieval and Enrichment of Archival Documents (READ)
Organisation	Swiss Federal Institute of Technology in Lausanne (EPFL)
Country	Switzerland
Sector	Public
PI Contribution	The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems.
• The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else.
• Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, around 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments had produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex.
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers in Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition.
• The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use.
• The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription, providing a proof of concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution	The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016-19, was a pan-European consortium consisting of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact	Transkribus https://readcoop.eu/transkribus/
Start Year	2016
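
Note on the Character Error Rate (CER) figures quoted in the PI Contribution above: CER is conventionally the character-level edit distance between the HTR output and the ground-truth transcript, divided by the length of the ground truth. The sketch below is illustrative only and is not the evaluation code used by the READ project or Transkribus; the function names and the sample strings are assumptions made for the example. Reading a CER of 18% as "82% of characters correct" is the usual shorthand (1 minus CER) and is approximate, because inserted characters also count as errors.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (0 if ref_char == hyp_char else 1)
            deletion = prev[j] + 1       # reference character missing from output
            insertion = curr[j - 1] + 1  # extra character in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the ground truth."""
    if not reference:
        raise ValueError("ground-truth transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: the model drops one 'p' from an 18-character line.
ground_truth = "greatest happiness"
htr_output = "greatest hapiness"
cer = character_error_rate(ground_truth, htr_output)
```

Here `cer` is 1/18, or roughly 5.6%: one wrong character in an 18-character line is already above the "well below 5%" level reported for straightforward manuscripts.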


Description	Retrieval and Enrichment of Archival Documents (READ)
Organisation	University of Edinburgh
Country	United Kingdom
Sector	Academic/University
PI Contribution	The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems.
• The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else.
• Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, around 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments had produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex.
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers in Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition.
• The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use.
• The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription, providing a proof of concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution	The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016-19, was a pan-European consortium consisting of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact	Transkribus https://readcoop.eu/transkribus/
Start Year	2016


Description	Retrieval and Enrichment of Archival Documents (READ)
Organisation	University of Innsbruck
Country	Austria
Sector	Academic/University
PI Contribution	The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems.
• The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else.
• Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, around 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments had produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex.
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers in Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition.
• The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use.
• The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription, providing a proof of concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution	The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016-19, was a pan-European consortium consisting of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact	Transkribus https://readcoop.eu/transkribus/
Start Year	2016


Description	Retrieval and Enrichment of Archival Documents (READ)
Organisation	University of Leipzig
Country	Germany
Sector	Academic/University
PI Contribution	The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems.
• The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else.
• Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, around 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments had produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex.
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers in Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition.
• The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use.
• The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription, providing a proof of concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution	The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016-19, was a pan-European consortium consisting of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact	Transkribus https://readcoop.eu/transkribus/
Start Year	2016


Description	Retrieval and Enrichment of Archival Documents (READ)
Organisation	University of London
Country	United Kingdom
Sector	Academic/University
PI Contribution	The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems.
• The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else.
• Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, around 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments had produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex.
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers in Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition.
• The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use.
• The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription, providing a proof of concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution	The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016-19, was a pan-European consortium consisting of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact	Transkribus https://readcoop.eu/transkribus/
Start Year	2016


Description	Retrieval and Enrichment of Archival Documents (READ)
Organisation	University of Rostock
Country	Germany
Sector	Academic/University
PI Contribution	The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems.
• The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else.
• Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, around 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments had produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex.
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers in Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition.
• The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use.
• The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription, providing a proof of concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution	The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016-19, was a pan-European consortium consisting of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact	Transkribus https://readcoop.eu/transkribus/
Start Year	2016


Description	Retrieval and Enrichment of Archival Documents (READ)
Organisation	Vienna University of Technology
Country	Austria
Sector	Academic/University
PI Contribution	The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems.
• The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else.
• Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, around 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments had produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex.
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers in Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition.
• The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use.
• The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription, providing a proof of concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution	The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016-19, was a pan-European consortium consisting of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact	Transkribus https://readcoop.eu/transkribus/
Start Year	2016


Description	Retrieval and Enrichment of Archival Documents (READ)
Organisation	Xerox Corporation
Department	Xerox Research Centre Europe - XRCE
Country	France
Sector	Private
PI Contribution	The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems.
• The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else.
• Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments had produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex.
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers of Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition.
• The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use.
• The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription. It is a proof of concept of a potentially transformative technology for further widening access to historic manuscripts.
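The Character Error Rate figures quoted above can be illustrated with a minimal sketch. It assumes the common definition of CER as the character-level edit distance between a model's output and the ground-truth transcript, divided by the length of the ground truth; the record does not state the exact formula used in the READ experiments, and the function names and sample strings below are hypothetical.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # reference character missing from hypothesis
                current[j - 1] + 1,      # extra character in hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER under the assumed definition: edit distance / ground-truth length."""
    if not reference:
        raise ValueError("ground-truth transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical ground truth and model output, for illustration only.
ground_truth = "the greatest happiness"
model_output = "the greatcst happincss"
print(f"CER: {character_error_rate(ground_truth, model_output):.1%}")
```

Under this definition, a CER of 18% corresponds only approximately to 82% of characters being correct, because inserted characters also count as errors.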
Collaborator Contribution	The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016 to 2019, was a pan-European consortium consisting of computer scientists, computational linguists, archives and information professionals, and humanities scholars. It carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library, and the respective national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact	Transkribus https://readcoop.eu/transkribus/
Start Year	2016


Description	tranScriptorium
Organisation	Polytechnic University of Valencia
Country	Spain
Sector	Academic/University
PI Contribution	tranScriptorium is a STREP of the Seventh Framework Programme in the ICT for Learning and Access to Cultural Resources challenge. tranScriptorium is planned to last from 1 January 2013 to 31 December 2015. tranScriptorium aims to develop innovative, efficient and cost-effective solutions for the indexing, search and full transcription of historical handwritten document images, using modern, holistic Handwritten Text Recognition (HTR) technology. tranScriptorium will turn HTR into a mature technology by addressing the following objectives: - Enhancing HTR technology for efficient transcription. Departing from state-of-the-art HTR approaches, tranScriptorium will capitalize on interactive-predictive techniques for effective and user-friendly computer-assisted transcription. - Bringing the HTR technology to users. Expected users of the HTR technology belong mainly to two groups: a) individual researchers with experience in transcribing handwritten documents who are interested in transcribing specific documents; b) volunteers who collaborate in large transcription projects. The HTR technology will support the digitization of the handwritten materials. The outcomes of the tranScriptorium tools will be attached to the published handwritten document images. This includes not only full, correct transcriptions, but also partially correct transcriptions and other kinds of automatically produced metadata, useful for indexing and searching.
Start Year	2013


Description	'Transcribe Bentham': Presentation to the Digital Communities winners' forum
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Presentation to attendees of the Digital Communities category winners' forum, Ars Electronica festival, Brucknerhaus, Linz, 4 September 2011.
Year(s) Of Engagement Activity	2011


Description	Crowdsourcing: Utilizing the Power of the Many in Research
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience
Results and Impact	Dr Causer was invited to take part in a panel on crowdsourcing, before an international audience of market researchers, to discuss ways in which different organisations have implemented crowdsourcing in what they do. The other panel members were: - Benita Matofska: Chief Sharer, People Who Share - Phil Geraghty: Managing Director, PeopleFund.it - Heidi Schneigansz: Idea Bounty
Year(s) Of Engagement Activity	2013


Description	Hacking the Past: An Archives Game Jam
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Public/other audiences
Results and Impact	Around 50 people attended this event, organised in conjunction with The National Archives, to create 'games with a purpose', that is, games designed to encourage the transcription of archival material.
Year(s) Of Engagement Activity	2019
URL	https://www.eventbrite.co.uk/e/hacking-the-past-an-archives-game-jam-tickets-53954846398#


Description	The Bentham Hackathon, 22-23 October 2017
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Public/other audiences
Results and Impact	Around 60 to 70 participants registered to take part in the 'Bentham Hackathon', organised in conjunction with UCL Centre for Digital Humanities, UCL Innovation and Enterprise, and IBM, to investigate how digital tools can be used to explore Bentham's life and work.
Year(s) Of Engagement Activity	2017
URL	https://blogs.ucl.ac.uk/transcribe-bentham/2017/10/24/project-update-bentham-hackathon-weekend/
