The Bentham Papers Transcription Initiative

Lead Research Organisation: University College London
Department Name: Bentham Project

Abstract

The Bentham Papers deposited in UCL Library consist of 60,000 folios. This material has never been properly edited, most of it has never been published, and two-thirds of it remains un-transcribed. Much of its content is, therefore, unknown. The Bentham Project was established in 1959 in order to produce an authoritative edition of 'The Collected Works of Jeremy Bentham', and has to date published twenty-six volumes of a projected sixty-eight. In 2006 the Project completed an on-line database catalogue of the Bentham Papers, consisting of up to sixteen fields of information for each of the folios, including headings, dates, pagination, and titles. The database is currently used as an editing tool by the Bentham Project, allowing it, for instance, to identify all the manuscripts relevant to a particular work in a moment, rather than after several weeks of manual searching.

Part of the vision at the time of creating the database was to enhance it by linking it to transcripts and digital images of the manuscripts. The Bentham Project has transcribed around 20,000 folios, but 40,000 remain to be done. Transcription is the first and fundamental stage in the editing of Bentham's works. A pilot project has established that an interface can be created linking transcripts, digital images, and the database catalogue, and that the digital images are of such quality that they can be used for the purposes of transcription. The object of the present project is to develop a web-site which will integrate these various elements in a coherent way, and which will also implement a 'crowd-sourcing' exercise, whereby members of the public will be invited to submit their own transcripts of previously unread manuscripts.

The transcription project will have a limited duration of six months. We will provide around 12,500 images of Bentham's manuscripts, amounting to around 10,000 folios. Individuals will be able to download a transcription tool, claim images of manuscripts for a limited period, and enter the text into a transcription window. There will be a series of basic rules that they will be asked to follow. Transcripts will be submitted to the Bentham Project for moderating and, once approved, made available on-line. Transcribers will be awarded a merit mark for each successful submission, both as a virtual reward and as a means of identifying the truly dedicated, who might then be involved in some more formal way with the resource. We will establish a users' forum and a means of undertaking joint transcription, so that two or more people can co-operate in producing transcripts.

The transcriptions will eventually be used by the professional researchers at the Bentham Project when preparing the material for the new critical edition. They will also form part of an 'ideas bank'. Bentham wrote on a wide variety of subjects. His ideas were of enormous historical importance, but are also of great contemporary relevance. The transcripts will be readily searchable, so that researchers, whether academics or members of the general public, who are interested in a particular subject, can discover what Bentham thought about that subject. The Bentham Project's existing transcripts, which are currently in a proprietary format, and the newly submitted transcripts, will be encoded according to the protocol of the Text Encoding Initiative, thus guaranteeing the sustainability and transferability of the resource.
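
As an illustration of what encoding a transcript according to the Text Encoding Initiative can look like, the following is a minimal sketch rather than the Bentham Project's actual encoding scheme. It assumes that a transcriber's crossings-out, interlineations, and doubtful readings are marked with the TEI elements del, add, and unclear; the function name encode_paragraph, the segment format, and the sample wording are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# TEI elements assumed here for manuscript features (an illustrative subset).
MARKUP_KINDS = {"del", "add", "unclear"}


def encode_paragraph(segments):
    """Build a TEI <p> element from (kind, text) pairs.

    kind is "text" for plain wording, or one of MARKUP_KINDS for a
    deletion, an addition, or an uncertain reading.
    """
    ET.register_namespace("", TEI_NS)
    paragraph = ET.Element(f"{{{TEI_NS}}}p")
    last_child = None
    for kind, text in segments:
        if kind == "text":
            # Plain text goes before the first child or after the latest one.
            if last_child is None:
                paragraph.text = (paragraph.text or "") + text
            else:
                last_child.tail = (last_child.tail or "") + text
        elif kind in MARKUP_KINDS:
            last_child = ET.SubElement(paragraph, f"{{{TEI_NS}}}{kind}")
            last_child.text = text
        else:
            raise ValueError(f"unsupported segment kind: {kind!r}")
    return ET.tostring(paragraph, encoding="unicode")


# Made-up sample line: a misspelt word struck through and rewritten.
sample = [
    ("text", "The "),
    ("del", "gretest"),
    ("add", "greatest"),
    ("text", " happiness"),
]
print(encode_paragraph(sample))
```

Because the markup is plain XML in a published namespace, transcripts encoded in this way can be read by any XML-aware tool, which is the sense in which such encoding supports sustainability and transferability.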

At the end of the project, a study will be produced, using both quantitative and qualitative data, on the way in which the database was used, and on the lessons to be drawn from it. A generic transcription tool will be made available for other humanities digital research projects to incorporate into their own web-sites.

Planned Impact

One of the key elements of this proposal is the interactive transcription tool. The very point is to engage the general public in the process of transcribing manuscripts, and hence stimulate greater interest in the life and thought of Jeremy Bentham. There is already significant passive interest in Bentham: many people have heard about Bentham's auto-icon (it was mentioned, for instance, in a 'Guardian' editorial 'In praise of University College London' on 10 October 2009, and features as a main attraction in the Ripley's Believe It Or Not Museum at Piccadilly Circus), and many are aware of the panopticon prison scheme. Bentham is taught on Religious Studies, Philosophy, and History courses in sixth-forms throughout the UK (we occasionally give lectures to school groups and we recently hosted two sixth-form work placement students). Our experience with visitors to the Bentham Project is that they are fascinated by Bentham's manuscripts, and are often keen to 'have a go' at reading them. We have received strong endorsement from a School's Extended Services Manager, responsible for 27 schools in Bedfordshire, and from a humanities teacher in a sixth-form college in Wigan, Greater Manchester, who believe the resource will be attractive to sixth-form teachers and students alike, as well as to Gifted and Talented children lower down the school.

We do, therefore, have a strong expectation that there will be a sizeable pool of people who will be interested in transcribing Bentham manuscripts. The transcription tool will be kept straightforward, the instructions will be clear, the digital images will be of high quality, and all will be presented in an attractive and intuitive interface. We will set up a users' forum, and permit joint transcription. Transcripts will be submitted to the Bentham Project, where they will be moderated, and the transcriber will receive a merit mark for each acceptable transcript (using reasonably generous criteria as to what is acceptable).

We intend that the resource should constitute an 'ideas bank', and should be used by a wide variety of persons who are interested in a wide variety of subject areas. Ideas do not suddenly appear in the mind from nowhere. One of our main sources of ideas is the great thinkers of the past, themselves drawing on traditions of thought, and sometimes inventing new ones of their own. By drawing on the wealth of ideas which the great thinkers of the past have bequeathed us, we can clarify our own thoughts, and often be pointed towards issues which had not even occurred to us. Bentham is particularly valuable in this respect, since his influence on our society has been profound, and he still has much to teach. We have a massive archive of his papers which has only been partially explored, and much of what has been explored remains in relatively inaccessible transcripts in the Bentham Project archive. Hence a key element of this proposal is to render the transcripts of the Bentham Papers searchable in a way that will be helpful to those who are in search of ideas.

The main target groups for the 'ideas bank' are policy makers (for instance think tanks) and the media. We are often asked, 'What would Bentham have said about ....?' The best response is, 'Well, here is what he did say about it.' Hence, when issues surrounding surveillance are debated, policy makers, for instance, might look to see what Bentham said about the panopticon prison; when issues surrounding openness in government, or corruption in public life, are being debated, they might look to Bentham's writings on representative democracy; and when issues concerning legal reform are being debated, they might look to Bentham's writings on codification.

The two elements of impact described here are intimately linked, in that the greater the amount of transcription undertaken, the more in

Publications

Schofield P (2015) Jeremy Bentham and the computer age: reflections on crowdsourcing the transcription of handwritten documents in Annual Bulletin of Resources and Historical Collections Office (shiryo-shitsu), The Library of Economics, The University of Tokyo

Causer T (2014) Crowdsourcing Bentham: Beyond the Traditional Boundaries of Academic History in International Journal of Humanities and Arts Computing

Moyle M (2011) Manuscript Transcription by Crowdsourcing: Transcribe Bentham in LIBER Quarterly: The Journal of the Association of European Research Libraries

Tieberghien E (2016) Mapping the Bentham Corpus

 
Title A film by UCL Media Services about Transcribe Bentham, and how it fits into the editorial work of the Bentham Project.
Type Of Art Film/Video/Animation 
 
Description (READ) - Recognition and Enrichment of Archival Documents
Amount € 8,220,716 (EUR)
Funding ID 674943 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 01/2016 
End 06/2019
 
Description (tranScriptorium) - tranScriptorium
Amount € 3,005,570 (EUR)
Funding ID 600707 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 01/2013 
End 12/2015
 
Description A Collaborative Project Between Faculty and Students at the University of Toronto and UCL Using Handwritten Text Recognition Technology and Topic Modelling
Amount £16,668 (GBP)
Organisation University College London 
Sector Academic/University
Country United Kingdom
Start 08/2019 
End 03/2021
 
Description The Consolidated Bentham Papers Repository
Amount £339,000 (GBP)
Organisation Andrew W. Mellon Foundation 
Sector Private
Country United States
Start 10/2012 
End 09/2014
 
Description Recognition and Enrichment of Archival Documents (READ)
Organisation Democritus University of Thrace
Country Greece 
Sector Academic/University 
PI Contribution The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems.
• The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else.
• Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%, meaning that roughly 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex.
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers of Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition.
• The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use.
• The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription: a proof-of-concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution The Recognition and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016 to 2019, was a pan-European consortium of computer scientists, computational linguists, archives and information professionals, and humanities scholars. It carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact Transkribus https://readcoop.eu/transkribus/
Start Year 2016
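
The Character Error Rate figures quoted in the PI Contribution above can be made concrete with a small worked example. The following is a minimal sketch, assuming the common definition of CER as the character-level edit (Levenshtein) distance between a recognised text and its ground-truth transcript, divided by the length of the ground truth; the source does not state which exact variant was used in the READ experiments, and the function names and sample strings are hypothetical.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    substitutions, insertions, and deletions turning one string into the other."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the ground truth."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Made-up example: the recogniser drops one 'p' from an 18-character phrase.
ground_truth = "greatest happiness"
recognised = "greatest hapiness"
cer = character_error_rate(ground_truth, recognised)
print(f"CER = {cer:.1%}")
```

Under this definition inserted characters also count as errors, so reading a CER of 18% as "82% of characters correct" is a close approximation rather than an exact identity.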
 
Description Recognition and Enrichment of Archival Documents (READ)
Organisation Direction de la Sécurité et de la Justice
Country Switzerland 
Sector Public 
Impact Transkribus https://readcoop.eu/transkribus/
Start Year 2016
 
Description Recognition and Enrichment of Archival Documents (READ)
Organisation NAVER LABS Europe
Country France 
Sector Public 
Impact Transkribus https://readcoop.eu/transkribus/
Start Year 2016
 
Description Recognition and Enrichment of Archival Documents (READ)
Organisation National Centre for Scientific Research (NCSR) Demokritos
Country Greece 
Sector Academic/University 
Impact Transkribus https://readcoop.eu/transkribus/
Start Year 2016
 
Description Recognition and Enrichment of Archival Documents (READ)
Organisation Polytechnic University of Valencia
Country Spain 
Sector Academic/University 
Impact Transkribus https://readcoop.eu/transkribus/
Start Year 2016
 
Description Recognition and Enrichment of Archival Documents (READ)
Organisation Swiss Federal Institute of Technology in Lausanne (EPFL)
Country Switzerland 
Sector Public 
PI Contribution The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems. • The Bentham Papers contain numerous features that an effective HTR platform needs to solve, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted that if HTR could deal with the Bentham Papers, then it could deal with almost anything else. • Transcripts produced by the Bentham Project were used to generate 'ground truth'-that is, a precise transcript by which to train HTR models to read eighteenth and nineteenth century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%, that is 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme subsequent experiments produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex. 
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers of Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition. • The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use. • The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription-a proof-of-concept of a potentially transformative technology for further widening access to historic manuscripts.
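The CER figures quoted above can be illustrated with a minimal sketch. It assumes the common definition of CER as the character-level edit (Levenshtein) distance between the HTR output and the ground-truth transcript, divided by the length of the ground truth; the record does not state the exact formula used in the READ experiments, and the function names and sample strings below are hypothetical.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # previous[j] holds the distance between the first i-1 reference
    # characters and the first j hypothesis characters.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(ground_truth: str, htr_output: str) -> float:
    """CER under the assumed definition: edit distance / ground-truth length."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return levenshtein(ground_truth, htr_output) / len(ground_truth)


if __name__ == "__main__":
    # Hypothetical sample line, not taken from the Bentham Papers.
    truth = "the greatest happiness"
    output = "the greatest hapiness"
    print(f"CER: {character_error_rate(truth, output):.1%}")
```

Under this definition a CER of 18% corresponds only approximately to 82% of characters being read correctly, because inserted characters also count as errors.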
Collaborator Contribution The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016 to 2019, was a pan-European consortium of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact Transkribus https://readcoop.eu/transkribus/
Start Year 2016
 
Description Retrieval and Enrichment of Archival Documents (READ) 
Organisation University of Edinburgh
Country United Kingdom 
Sector Academic/University 
PI Contribution The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems. • The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else. • Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex. 
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers of Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition. • The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use. • The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription-a proof-of-concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016 to 2019, was a pan-European consortium of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact Transkribus https://readcoop.eu/transkribus/
Start Year 2016
 
Description Retrieval and Enrichment of Archival Documents (READ) 
Organisation University of Innsbruck
Country Austria 
Sector Academic/University 
PI Contribution The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems. • The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else. • Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex. 
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers of Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition. • The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use. • The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription-a proof-of-concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016 to 2019, was a pan-European consortium of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact Transkribus https://readcoop.eu/transkribus/
Start Year 2016
 
Description Retrieval and Enrichment of Archival Documents (READ) 
Organisation University of Leipzig
Country Germany 
Sector Academic/University 
PI Contribution The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems. • The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else. • Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex. 
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers of Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition. • The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use. • The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription-a proof-of-concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016 to 2019, was a pan-European consortium of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact Transkribus https://readcoop.eu/transkribus/
Start Year 2016
 
Description Retrieval and Enrichment of Archival Documents (READ) 
Organisation University of London
Country United Kingdom 
Sector Academic/University 
PI Contribution The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems. • The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else. • Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex. 
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers of Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition. • The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use. • The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription-a proof-of-concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016 to 2019, was a pan-European consortium of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact Transkribus https://readcoop.eu/transkribus/
Start Year 2016
 
Description Retrieval and Enrichment of Archival Documents (READ) 
Organisation University of Rostock
Country Germany 
Sector Academic/University 
PI Contribution The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems. • The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else. • Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex. 
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers of Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition. • The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use. • The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription-a proof-of-concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016 to 2019, was a pan-European consortium of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact Transkribus https://readcoop.eu/transkribus/
Start Year 2016
 
Description Retrieval and Enrichment of Archival Documents (READ) 
Organisation Vienna University of Technology
Country Austria 
Sector Academic/University 
PI Contribution The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems. • The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else. • Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex. 
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers of Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition. • The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use. • The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription-a proof-of-concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016 to 2019, was a pan-European consortium of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact Transkribus https://readcoop.eu/transkribus/
Start Year 2016
 
Description Retrieval and Enrichment of Archival Documents (READ) 
Organisation Xerox Corporation
Department Xerox Research Centre Europe - XRCE
Country France 
Sector Private 
PI Contribution The Bentham Project has made a key contribution to the development and adoption of Handwritten Text Recognition (HTR) and other technologies which are transforming, in a way that would have been barely imaginable a decade ago, how the public and researchers access holdings of archival collections. This is being achieved through the accurate transcription by computers of historical documents written in a variety of languages and scripts. Critical to developing HTR, and the associated technologies incorporated into Transkribus, were digital images and transcripts of the Bentham Papers. Together, this material provided a central test case for computer scientists to ensure the technology was sufficiently robust to contend with a variety of problems. • The Bentham Papers contain numerous features that an effective HTR platform needs to handle, such as difficult handwriting, pages written in different hands (e.g. copyists and correspondents), headings, marginalia, faint pencil writing, skewed writing, crossings-out, interlineations, and occasional use of Latin, Greek, and French. As computer scientist Professor Enrique Vidal noted, if HTR could deal with the Bentham Papers, then it could deal with almost anything else. • Transcripts produced by the Bentham Project were used to generate 'ground truth', that is, a precise transcript with which to train HTR models to read eighteenth- and nineteenth-century English handwriting. Early experiments resulted in a model with a Character Error Rate (CER) of around 18%; that is, 82% of the characters on a fairly straightforward Bentham manuscript were correctly recognised. By the end of the READ programme, subsequent experiments produced models capable of achieving a CER of well below 5% on straightforward Bentham manuscripts, and a CER of 9% on the most complex. 
• The Bentham ground truth was made available to computer scientists for use in research competitions linked to the 2014 International Conference on Frontiers of Handwriting Recognition and the 2015 International Conference on Document Analysis and Recognition. • The HTR models produced using Bentham ground truth are freely available in Transkribus as off-the-shelf English language models for others to use and re-use. • The Bentham Project worked with colleagues at the Universitat Politècnica de València to produce the 'Bentham Papers Indexing and Search' engine. Based on pattern recognition, the engine allows the user to search around 100,000 pages of the 'iconic' Bentham manuscripts collection without the need for transcription-a proof-of-concept of a potentially transformative technology for further widening access to historic manuscripts.
Collaborator Contribution The Retrieval and Enrichment of Archival Documents (READ) research team, funded by the European Commission's Horizon 2020 programme from 2016 to 2019, was a pan-European consortium of computer scientists, computational linguists, archives and information professionals, and humanities scholars, which carried out fundamental research in the indexing, searching, and full transcription by computers of handwritten historic manuscripts, using Handwritten Text Recognition (HTR), Document Image Analysis, and Keyword Spotting technologies. The tools, once developed, were made widely available to the public, scholars, and research institutions through the Transkribus client, now a standard tool for the automated transcription of handwritten material with tens of thousands of registered users. Transkribus is now supported and further developed by the READ COOP, a non-profit organisation whose subscribers include the British Library and the national archives of Finland, Luxembourg, Norway, and Sweden. The work of the READ team was given the final (highest) rating of 'outstanding' by the European Commission's assessment panel. Such was the success of the READ project that it received one of the five Horizon Impact Awards for 2020 from the Commission, out of a field of 225 applicants across all disciplines.
Impact Transkribus https://readcoop.eu/transkribus/
Start Year 2016
 
Description tranScriptorium 
Organisation Polytechnic University of Valencia
Country Spain 
Sector Academic/University 
PI Contribution tranScriptorium is a STREP of the Seventh Framework Programme in the ICT for Learning and Access to Cultural Resources challenge. tranScriptorium is planned to last from 1 January 2013 to 31 December 2015. tranScriptorium aims to develop innovative, efficient and cost-effective solutions for the indexing, search and full transcription of historical handwritten document images, using modern, holistic Handwritten Text Recognition (HTR) technology. tranScriptorium will turn HTR into a mature technology by addressing the following objectives:
- Enhancing HTR technology for efficient transcription: departing from state-of-the-art HTR approaches, tranScriptorium will capitalize on interactive-predictive techniques for effective and user-friendly computer-assisted transcription.
- Bringing the HTR technology to users: expected users of the HTR technology belong mainly to two groups: a) individual researchers with experience in transcribing handwritten documents who are interested in transcribing specific documents; b) volunteers who collaborate in large transcription projects.
The HTR technology will support the digitization of the handwritten materials. The outcomes of the tranScriptorium tools will be attached to the published handwritten document images. This includes not only full, correct transcriptions, but also partially correct transcriptions and other kinds of automatically produced metadata, useful for indexing and searching.
Start Year 2013
 
Description 'Transcribe Bentham': Presentation to the Digital Communities winners' forum 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation to attendees of the Digital Communities category winners' forum, Ars Electronica festival, Brucknerhaus, Linz, 4 September 2011.
Year(s) Of Engagement Activity 2011
 
Description Crowdsourcing: Utilizing the Power of the Many in Research 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience
Results and Impact Dr Causer was invited to take part in a panel on crowdsourcing, before an international audience of market researchers, to discuss how different organisations have implemented crowdsourcing in their work.

The other panel members were:

- Benita Matofska: Chief Sharer, People Who Share
- Phil Geraghty: Managing Director, PeopleFund.it
- Heidi Schneigansz: Idea Bounty
Year(s) Of Engagement Activity 2013
 
Description Hacking the Past: An Archives Game Jam 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Around 50 people attended this event, organised in conjunction with The National Archives, to create 'games with a purpose', that is, games designed to encourage the transcription of archival material.
Year(s) Of Engagement Activity 2019
URL https://www.eventbrite.co.uk/e/hacking-the-past-an-archives-game-jam-tickets-53954846398#
 
Description The Bentham Hackathon, 22-23 October 2017 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Around 60 to 70 participants registered to take part in the 'Bentham Hackathon', organised in conjunction with UCL Centre for Digital Humanities, UCL Innovation and Enterprise, and IBM, to investigate how digital tools can be used to explore Bentham's life and work.
Year(s) Of Engagement Activity 2017
URL https://blogs.ucl.ac.uk/transcribe-bentham/2017/10/24/project-update-bentham-hackathon-weekend/