Crowd-Sourcing Scoping Study

Lead Research Organisation: King's College London
Department Name: Information Services and Systems

Abstract

Enthusiasts, connoisseurs and interested amateurs can and do offer much knowledge to the arts and humanities, but we understand little about the communities they form and come from. This review will focus on crowd-sourcing, an existing technique for gathering information from large, distributed user groups, which is ripe to be expanded and developed to meet humanities research needs. What can these communities offer, what relationships do they form with the academy, and how do aspects of community motivate them to contribute their knowledge? The review will be inquisitive - looking at the nature of those communities - and proactive, working with the Connected Communities programme to identify the most productive ways for the AHRC to engage with them. At the heart of this is crowd-sourcing. First coined in 2006, the term itself is not unproblematic: crowd-sourcing has been described as a business model in which a company's consumer group participates in product or service design, and as the harvesting of large amounts of user generated content from social media sites (see Huberman et al. 2010, 'Crowd-sourcing, attention and productivity', JIS 2009, vol 36:6, 758-765). The questions it provokes for the arts and humanities are numerous: how communities form around a shared interest in an academic subject, and what role is played by a shared capacity to contribute knowledge to it; what benefits accrue from this process and to whom; which parts of the AHRC's domain stand to gain most by engaging with such 'contributing communities'; how and why these communities come to exist, and how they can be identified and reached by the Connected Communities programme; and - not least - what the main technical challenges are for humanists and arts researchers who wish to draw on the expertise of geographically distributed communities. In recent years the HE sector has responded to these questions obliquely, by adopting crowd-sourcing models.
However, few if any of these questions have been addressed systematically (until now). Most academic research projects using crowd-sourcing have maintained the approach implied by the word 'crowd', and have been oriented towards the volume of participation, and the advancement of knowledge that a large community of volunteers can make only by virtue of its numbers (examples include GalaxyZoo and Distributed Proofreaders). This review seeks to be the beginning of a process of exploration into how crowd-sourcing models can be adapted, re-purposed and reformed in order to become useful vehicles for the creation of academic knowledge in the arts and humanities.

The review incorporates a number of information-gathering strands: desk research into the literature produced by crowd-sourcing projects in the sciences and social sciences, focusing in greater depth on the early adopters in the humanities and arts; a survey of contributors to crowd-sourcing projects exploring why they contribute data and what they get out of it; and in-depth interviews with a selected sample of these. This will be complemented by on-line and face-to-face networking activities in the form of two workshops and an interactive website.

The outcomes will be an overview of crowd-sourcing's current application in the arts and humanities, and of the background of that application in the sciences and social sciences; an assessment of where, in the arts and humanities, the most fruitful possibilities lie for these applications in the future; and a typology of crowd-sourcing methods which will offer the Connected Communities programme a framework for its future thinking in this area.

Planned Impact

Crowd-sourcing is an emerging methodology in academic research, yet by its very nature it is closely bound up with the academic impact agenda. In order to work, any academic crowd-sourcing project *must* have an accessible public interface, an effective communications strategy, and some formal relationship with its community of contributors. By gathering and articulating the experiences of academics who have done this, or who are motivated to want to do it, we will be addressing a fundamental issue of how researchers in universities can best engage non-expert or semi-expert communities outside universities. This will feed directly into the AHRC's thinking via our engagement with the Connected Communities programme. The main categories of stakeholders who will benefit from this review are as follows:

Funding bodies: The Connected Communities programme provides an unparalleled opportunity for the AHRC to engage with, and lead, the crowd-sourcing agenda as it applies to the arts and humanities. This is necessary for the AHRC's own aims. In the past ten years the AHRC has funded numerous digital resource creation activities via programmes specifically focused on digitisation, and in the course of research activities, and these represent a significant investment on the part of the funding bodies. Many academics involved in these resources' creation are looking for new ways to develop and refine their digital collections. Moreover, digital research in these areas, it has been noted, frequently conforms to the 'long tail' model. A resource may have niche or limited interest in the short-term, but over the longer term larger and more diverse communities may access and use the information. The review will investigate methods for revitalising interest in existing digital resources via crowd-sourcing models, and how those models can help enhance and sustain humanities digital resources. The project outputs will contribute to a developing body of work being produced by funding bodies on methods for promoting the use of digital resources over time.

Social and economic benefits: It is frequently observed within both the government and business communities that the "digital economy" and the "knowledge economy" are key foundations of the UK's continued prosperity, and that digital inclusivity is crucial if this prosperity is to benefit society as a whole. Indeed, the recent Digital Britain report highlights the need to maximise the social and economic benefits from digital technologies. This review will enable a symbiosis to be developed between academic knowledge, economic benefit and communal and personal interest.

Development of international cultural and economic ties: Given that crowd-sourcing is fundamentally an internet-based phenomenon, it follows that its application can, potentially, cross national boundaries. Formalizing crowd-sourcing methods for the humanities and arts will, potentially, make it easier for scholars to build community-facing projects across different cultural and linguistic zones. It is also likely that the review's outcomes will have much to contribute to future partnerships that the AHRC might wish to build with overseas funding bodies.

Students, educators and the broader public: The methodology and technological approach addressed by the review will by its very nature support engagement with the public and facilitate collaboration between the public and researchers. It will highlight the fact that crowd-sourcing is scalable. The most well-known current applications are limited, in the main, to large and sophisticated projects, which require significant start-up funding, overheads and expertise. Pinning down and categorizing the methods in question, as this review proposes, will bring engagement with crowd-sourcing communities within reach of small projects with specific aims, PhD projects, and even undergraduate teaching programmes.

Publications

 
Description This project sought to establish a credible definition for, and the current state of the art of, crowd-sourcing in the humanities. The questions included what the humanities have learned from other research domains, where crowd-sourcing is being exploited, what the results are, why academics are motivated to undertake such activities, and why members of the public are willing to give up their time, effort and knowledge for free. We conducted a survey of contributors' motivations, supplemented by a set of follow-up interviews, which received 59 detailed responses with qualitative and quantitative information about why people contribute to humanities crowd-sourcing projects. The project identified and reviewed 54 academic publications of direct relevance to the field, and a further 51 individual projects, activities and websites which document or present some application of humanities scholarship making use of crowd-sourcing. Two workshops were held, one for academics making use of crowd-sourcing, and one for contributors to those projects. Academics in the humanities undertake crowd-sourcing projects for a variety of reasons: to digitize content, to create or process content, to provide editorial or processing interventions, and so on. Judging the current value of crowd-sourcing in the humanities is therefore extremely difficult, even before issues of trust, reliability and academic rigour are accounted for. However, one common factor is that humanities crowd-sourcing succeeds where vibrant and interacting communities of contributors are created. Whilst the motivations of crowd-sourcing contributors are every bit as diverse as those of academics, passion for the subject (a characteristic shared with academics) is the dominant factor that draws them together into communities. These communities develop and perpetuate internal dynamics, self-correct, provide mutual support, and form their own relationships with the academic world.
Despite the great diversity of humanities crowd-sourcing, it is possible to observe patterns in which such communities thrive: these patterns are dependent on the correct combinations of asset type (the content or data forming the subject of the activity), process type (what is done with that content), task type (how it is done), and the output type (the thing produced) desired. In this report, we propose a high-level typology that describes different instances of each of these, and identifies the combinations that are, on present evidence, most successful in achieving projects' aims.
Exploitation Route The framework may be used for the analysis of crowd-sourcing projects, as well as the planning of new ones. We would also anticipate that the model could be developed further, or modified in the light of additional research.
Sectors Creative Economy; Government, Democracy and Justice; Culture, Heritage, Museums and Collections

URL http://humanitiescrowds.org/wp-content/uploads/2012/12/Crowdsourcing-connected-communities.pdf
 
Description The analysis carried out by the project has influenced crowdsourcing activities in the cultural sector (both public and private).
First Year Of Impact 2012
Sector Creative Economy; Culture, Heritage, Museums and Collections
Impact Types Cultural

 
Description PARTHENOS - H2020-INFRADEV-1-2014-1
Amount € 363,053 (EUR)
Funding ID 654119 
Organisation European Commission 
Department Horizon 2020
Sector Public
Country European Union (EU)
Start 05/2015 
End 04/2019
 
Description Collaboration with University of Lincoln 
Organisation University of Lincoln
Country United Kingdom 
Sector Academic/University 
PI Contribution Humanities crowdsourcing research in relation to a series of community archaeology activities.
Collaborator Contribution Project starts in 2019 - none as yet.
Impact Project starts in 2019 - none as yet.
Start Year 2019
 
Description Research collaboration with Stanford University (Center for Spatial and Textual Analysis - CESTA) 
Organisation Stanford University
Country United States 
Sector Academic/University 
PI Contribution Jointly organised international symposium on humanities crowdsourcing at King's College London (September 2015). Staff research visits from King's College London to Stanford University/CESTA. Jointly organised collaborative panel session at IEEE Big Data 2015 in California (October 2015). Jointly organised symposium on humanities crowdsourcing at Stanford University (October 2017).
Collaborator Contribution Jointly organised international symposium on humanities crowdsourcing at King's College London (September 2015). Hosted research visits of KCL academic staff to Stanford University/CESTA. Jointly organised collaborative panel session at IEEE Big Data 2015 in California (October 2015). Jointly organised symposium on humanities crowdsourcing at Stanford University (October 2017).
Impact International symposium on humanities crowdsourcing at King's College London (September 2015). Collaborative panel session at IEEE Big Data 2015 in California (October 2015). International symposium on humanities crowdsourcing at Stanford University (October 2017).
Start Year 2013
 
Description Research collaboration with University of Maryland (Digital Curation Innovation Center - DCIC) 
Organisation University of Maryland, College Park
Country United States 
Sector Academic/University 
PI Contribution Co-organised series of workshops at IEEE Big Data conferences (2013, 2014, 2015). Co-organised session on crowdsourcing at ARCHIVES 2013 (the 2013 annual conference of the Society of American Archivists). Co-wrote funding applications.
Collaborator Contribution Co-organised series of workshops at IEEE Big Data conferences (2013, 2014, 2015). Co-organised session on crowdsourcing at ARCHIVES 2013 (the 2013 annual conference of the Society of American Archivists). Co-wrote funding applications.
Impact Series of workshops at IEEE Big Data conferences (2013, 2014, 2015). Session at ARCHIVES 2013, New Orleans.
Start Year 2012
 
Description Citizen Humanities Comes of Age: Crowdsourcing for the Humanities in the 21st Century 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Research in the humanities was once the preserve of an academic and professional elite, conducted in universities, libraries, museums and archives, with clear criteria for belonging to the communities undertaking it. In recent years, however, science and business, which shared this culture of exclusivity with the humanities, have found these boundaries challenged through crowdsourcing, and have flourished as a result. This collaborative and interdisciplinary symposium, organised jointly by King's College London's Department of Digital Humanities (DDH) and Stanford University's Center for Spatial and Textual Analysis (CESTA), sought to explore the ways in which humanities and cultural heritage research is enriched through scholarly crowdsourcing. It brought together the unique perspectives on the subject that DDH and CESTA have developed over the past three years, including DDH's Crowd-Sourcing Scoping Study funded by the AHRC, and Stanford's Humanities Crowdsourcing research theme. The meeting explored the arc from the inception of humanities crowd-sourcing as a method of data processing adopted largely uncritically from big science, to its present instance as a means of interrogating fuzzy and disparate humanities research data in new ways using 'non-professional' engagement and input, and on to future possibilities involving completely new ways of co-producing humanities research across increasingly blurred institutional and professional boundaries.
Year(s) Of Engagement Activity 2015
URL https://connected-communities.org/index.php/news/citizen-humanities-comes-of-age-crowdsourcing-for-t...
 
Description Crowdsourcing: participatory digital research projects (at CRASSH, Cambridge) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Talk resulted in discussions during the workshop.
Year(s) Of Engagement Activity 2013
URL http://www.crassh.cam.ac.uk/events/2515/
 
Description Digital Impacts: Crowdsourcing in the Arts and Humanities (at Oxford Internet Institute) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Discussions at and after workshop.
Year(s) Of Engagement Activity 2013
URL http://www.oii.ox.ac.uk/events/?id=573
 
Description More than a business model: crowd-sourcing and impact in the humanities 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact N/A
Year(s) Of Engagement Activity 2013
URL http://blogs.lse.ac.uk/impactofsocialsciences/2013/03/21/more-than-a-business-model-crowd-sourcing-a...
 
Description Project workshop with organisers of crowdsourcing projects 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact For this workshop we invited organisers of crowdsourcing projects (academics and others, e.g. from cultural and memory institutions), with the aim of identifying the main questions and areas of interest being addressed by crowdsourcing projects in the humanities and cultural heritage, as well as investigating the processes through which the public are engaged by such projects.
Year(s) Of Engagement Activity 2012
URL http://humanitiescrowds.org/wp-content/uploads/2012/09/workshop_report1.pdf
 
Description Project workshop with participants in crowdsourcing projects (i.e. members of the public) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact We gained information about the motivations and activities of participants in crowdsourcing projects.
Year(s) Of Engagement Activity 2012