Twitter rape threats and the discourse of online misogyny

Lead Research Organisation: Lancaster University
Department Name: Linguistics and English Language


In July 2013, after Caroline Criado-Perez successfully campaigned to have a woman appear on an English banknote, she was inundated with misogynistic abuse on Twitter involving graphic, sadistic, and repeated threats of rape and murder. When MP Stella Creasy defended Criado-Perez, she received similar abuse, and in the following days this escalated into bomb threats sent to several public figures such as academic Mary Beard and celebrity Coleen Nolan.

There is a remarkable lack of research into such behaviour, making an evidence-based approach to it problematic for investigative bodies, parliamentarians, and academics. Despite this, policymakers and legislators are under intense pressure to make quick, long-term decisions on relevant policy and procedure. Indeed, the Department for Culture, Media & Sport (DCMS) Select Committee has already announced an autumn inquiry into harmful online content.

With the above in mind, this project will provide urgently needed insight into those who send extreme online misogynistic threats. Specifically, we will investigate what their language reveals about (1) their concerns, interests, and ideologies; what concept do they seem to have of themselves and their role in society? (2) their motivations and goals; what seems to trigger them? What do they seem to be seeking? and (3) the links between them and other individuals, topics, and behaviours; do they only produce misogynistic threats or do they engage in other hate-speech? Do they act alone or within networks?

To tackle such questions we will start by exploring a seed corpus of nearly 300 such tweets, already given to the project by Criado-Perez and Creasy, for links to related accounts, tweets, topics, and so forth. This content will be downloaded into a large, rich, finely-structured, multi-layered corpus of abuse containing content and accounts that clearly link back to the original abusive tweets. Whilst it would be possible to analyse the seed corpus alone, corpus-based methods harness computing power to enable the rapid, reliable analysis of massive linguistic datasets. Using this approach allows us to undertake a more general analysis of misogynistic hate-speech (i.e. as directed at non-high-profile figures) rather than focus purely upon the high-profile cases currently driving debate in this area. By doing this, we open up the possibility of revealing large-scale trends that would be hidden in, if not entirely absent from, small datasets like the seed corpus.

This project aims to advance research into misogynistic online threats in a sufficiently timely manner to engage with public concern, inform policy (e.g. the DCMS inquiry), and advise legislators. Our first task then will be to explore the questions: (1) "What can we learn about the rape threat trolls?" and (2) "Why do they make rape threats?" Since we anticipate that these will be of greatest immediate value to our non-academic research users, we plan on largely completing them in the first third of the project's timespan. We will then expand the project to answer objective (3) "What else do rape threat trolls do?" with a view to identifying whether they also engage in more general online hate-speech. Throughout this project, we will also address two methodological objectives: (4) "What is the most appropriate structure for our corpus?" and (5) "What are the most appropriate methods (from the suite of corpus-based methods) for interrogating our corpus?"

In summary, this project will be relevant to several social sciences including sociology, criminology, politics, psychology, and law. It also offers timely insight in an area where policy, practice, and legislation are currently under intense scrutiny and require such research to help shape future developments. As such, the results will likely be of interest to legislators, policymakers, investigative bodies, and law enforcement agencies, as well as the study participants, media, and general public.

Planned Impact

This project should benefit a wide range of academics and non-academics.

We have worked with two MPs and the Ministry of Justice lead on hate crime, Superintendent Paul Gianassi, in putting together this proposal (see Objectives in the Case for Support). We have also spoken with, and gathered data from, victims of online misogyny. These people are very immediate potential beneficiaries of this research. However, through them we intend to reach out to a wider range of non-academic consumers of research. These will principally be in the areas of policy making and law enforcement. In general, the project will produce research of relevance to a pressing contemporary issue that should be of interest to a range of organizations in the public and private sectors.

We would also argue that the corpus approach taken on this project provides, in itself, a substantial benefit for such users of social science research: the findings that emerge from the work of the project and our work with our principal non-academic research user, Superintendent Paul Gianassi (see Pathways to Impact for more details), will represent a substantial advance on the results that users have had access to heretofore.

The public should also benefit from the outputs of this Centre in at least two ways: i) this project clearly touches on issues in which there is substantial public interest - we will respond to that public interest by ensuring that there is a timely communication of research findings to the general public; ii) the consumption of our research findings by policy makers in particular holds out the prospect of these findings having a beneficial impact on public policy, hence indirectly impacting upon the public.

The academics that will benefit from the Centre come not only from linguistics, but from a range of subjects in the social sciences, most notably Criminology. For researchers in Criminology in particular, we would hope to change their research practices by demonstrating the advantages of combining the corpus approach with other approaches to the analysis of language.

Ensuring that the project has the impact it promises is of vital importance to this proposal. In consequence we have worked both with a key researcher on hate speech in Criminology and with potential non-academic research users in designing this project to ensure that the project proposed here would be of genuine use to them. Similarly, we have made sure that the project itself contains a broad range of researchers and has, from its inception, access to a range of research consumers. By designing the project in consultation with those who should benefit from it, and by working throughout the life of the project with those who can gain from our work, we believe that we have ensured that the promise of this project can be realised for the good of the academic community, the private sector, the public sector and the general public.


Description We have made an extensive range of discoveries and developments on this grant. To summarise them as briefly as possible, to address our methodological RQs (RQs 4 and 5) we created software called FireAnt, which is designed to help the analyst enhance the "signal" (i.e. the good quality, relevant parts) in a large, "noisy" dataset (i.e. a dataset that might contain a lot of irrelevant content).

With regard to RQ1 (what can we learn about the rape threat trolls?), we established that trolls can come from a very wide variety of backgrounds - the stereotype of the "angry young man" is misleading at best and dangerous at worst.

RQ2 (why do they make rape threats?) proved particularly interesting, in that the data suggests a wide array of motivations for trolling, including a need for validation and attention, boredom, an escalation brought about by competition or conflict, and so on. We also found these individuals justifying their behaviour through a number of strategies, including claiming that this was freedom of speech, that they were just joking, that the targets had brought the attacks on themselves, that things written online weren't "real" and so shouldn't be taken seriously, and so forth. In other words, some of these individuals seemed to conceive of themselves as activists protecting free speech, whilst others framed their interaction as edgy or subversive, and as banter. Some of the goals that the data suggested included the silencing of women, especially in an attempt to stop a perceived over-correction of the inequality between men and women, along with simply having fun and joining in a widespread mob-attack on an apparently easy target.

RQ3 (what else do these trolls do?) also provided especially rich results. We found that some of these individuals seemed committed to undertaking a wide array of online abuse ranging from anti-Semitism and homophobia to misogyny and racism. Others, however, appeared to be committed purely to attacking prominent women, and still others appeared to be individuals who were aggrieved, for whatever reason, just with Criado-Perez and her campaign. We found that different users formed networks across Twitter in different ways. The "committed" trolls (for want of a better term), who appeared to know each other from external sites and worked together in a loosely organised group, typically either produced low-investment accounts (new accounts with meaningless or abuse-specific usernames, no bio, picture, or followers, etc.) which they used for a few attacks and then abandoned, or employed pre-existing, abuse-focused accounts which had minimal details but would typically include in-group identifying markers in the bio, such as references to *sec/*sek (a reference to organisations such as lulzsec). Other loosely organised groups included those who used the same lexicon, ideologies, and in-group markers as the Men's Rights Activist (MRA) groups to be found across sites like Reddit (e.g. in the subreddit The Blue Pill) and beyond, thereby providing evidence of ongoing misogynistic discourses and online communities across the wider internet.
Exploitation Route This work could well be taken forwards by a surprisingly wide range of others, e.g. government and justice (understanding the nature of online abuse, identifying escalation and assimilation, creating legislation suited to managing extreme abuse), schools and educational organisations (the impact of online abuse, the effects of certain types of response to that abuse, counselling students against abusing others), IT/comms industries (designing sites that minimise online abuse), and communities and social services (advising those who are receiving online abuse, creating policy that deals with extreme abuse).
Sectors Communities and Social Services/Policy, Digital/Communication/Information Technologies (including Software), Democracy and Justice

Description We were invited by Twitter to their London Headquarters in November 2014 to present the findings from this project, and then in April 2015, TechCrunch published a piece which said: "Twitter is also making two policy changes aimed at tightening the screw on violent threats by widening what it said was an "unduly narrow" definition of threats before - which sounds very much like it's aimed at tackling terrorist propaganda spread via Twitter. "We are updating our violent threats policy so that the prohibition is not limited to "direct, specific threats of violence against others" but now extends to "threats of violence against others or promot[ing] violence against others." Our previous policy was unduly narrow and limited our ability to act on certain kinds of threatening behavior. The updated language better describes the range of prohibited content and our intention to act when users step over the line into abuse," writes Doshi. [...] Twitter has been doing outreach on online misogyny. Last November it heard evidence from a team of academics affiliated with Lancaster University who had been researching online misogyny and rape threats made using Twitter for a research project that started in November 2013. The Discourses of Online Misogyny (DOOM) team was also aiming to develop methods and tools for analyzing online hate speech, such as building up linguistic profiles of abusers and identifying community-specific lexis in order to aid automating the detection of online abuse and abusers. It looks likely that Twitter is drawing on some of that research here."
This change of policy was also widely publicised in the news. In light of the article above, I emailed Nick Pickles, my contact at Twitter (the UK Head of Public Policy, who had invited me to present my findings there), to check that our research had actually informed their policy update, rather than TechCrunch having simply drawn this conclusion themselves. He confirmed, in his words: "you may definitely believe it as that's a well briefed piece :)". In short, our most substantial impact has been to encourage Twitter to update their policy on online abuse. Additionally, given the sheer amount of media engagement we have undertaken throughout this project, there will have been a broad span of impact whose full extent we cannot measure or know. I continue to give interviews and provide expert opinions, and do not envisage this stopping or even decreasing over time. Indeed, one entirely unforeseen and yet wonderfully positive development has been how many times I have now been contacted as a role model for women in science. I have been invited to give talks for organisations aimed at promoting the inclusion of women in science; I have been asked for advice by, and have mentored, women about the benefits and challenges of having a high media profile; and at this very moment I have an email waiting to be answered about being interviewed for an International Innovation publication that will also include the Athena SWAN Charter, Women in Science Australia, and Norway's Committee on Gender Balance and Diversity in Research. This latter type of impact is perhaps not one that ties neatly into this project, yet it is a direct result of it and therefore should not be ignored, even though there is no tidy place for it in this report.
First Year Of Impact 2015
Sector Communities and Social Services/Policy,Digital/Communication/Information Technologies (including Software),Government, Democracy and Justice,Security and Diplomacy,Other
Impact Types Cultural, Policy & public services

Description Twitter updated its Terms of Service/abuse policy partially as a result of this project
Geographic Reach Multiple continents/international 
Policy Influence Type Contribution to a national consultation/review
Impact The change in question broadened the definition of abuse from simply directed, targeted abuse to include instigating others to be abusive. This substantially enabled Twitter users to take action against accounts that were either using their influence to orchestrate attacks, or that involved themselves in attacks. Prior to this update, such behaviour was technically acceptable.
Description ESRC Centre for Corpus Approaches to Social Science (Transition Review)
Amount £864,106 (GBP)
Funding ID ES/R008906/1 
Organisation Economic and Social Research Council 
Sector Public
Country United Kingdom
Start 03/2018 
End 03/2024
Title FireAnt - software for downloading, filtering, and exporting Twitter data 
Description This software enables ordinary researchers to download tweets, and then filter and export the results into a format that they are most comfortable working with. At this point (2019) this tool has been shared with an estimated 10,000 researchers across a wide range of different disciplines. 
Type Of Material Improvements to research infrastructure 
Year Produced 2016 
Provided To Others? Yes  
Impact Based on a current Google Scholar search (conducted February 2019), FireAnt has been cited in publications 62 times. Given that the tool was only released in mid-2016, just under three years ago, that we did not have any budget for publicising its launch, and that publications can take a year or two to go into press, this seems an especially positive count. 
Title CCP corpora 
Description This is the complete corpus of tweets by and to Criado-Perez during July, August, and September 2013. 
Type Of Material Database/Collection of data 
Provided To Others? No  
Impact The research in this project could not be undertaken without it. This data provided all the results for the project. 
Description Mapping UK Far-Right Movements Online: Combining Analyses of Networks and Discourse 
Organisation International Centre for the Study of Radicalisation and Political Violence
Country United Kingdom 
Sector Public 
PI Contribution Mark McGlashan of the DOOM project and CASS is working alongside partners at ICSR to implement a combination of Social Network Analysis, Corpus Linguistics, and Discourse Analysis to map the UK's online far-right movement, the communities that exist and develop, and the language that they use.
Collaborator Contribution Contributed expertise in Social Network Analysis, Corpus Linguistics, and Discourse Analysis. Development of analytical tools and leading skills/knowledge transfer with regards to Social Network Analysis, Corpus Linguistics, and Discourse Analysis.
Impact x
Start Year 2014
Description Research partnership with Cognizant, India - monitoring risk and generating leads from public social media data for commercial purposes
Organisation Cognizant Technology Solutions
Country United States 
Sector Private 
PI Contribution Contributed knowledge and skills in Corpus Linguistics, applying corpus linguistic methods for marketing purposes. A case study of the work can be found here:
Collaborator Contribution The partner (Cognizant) contributed knowledge regarding Social Network Analysis and the programming language R. Their contributions helped advance the skills and techniques implemented as part of the Twitter Rape Threats and Discourse of Online Misogyny project.
Impact Contributed to the production of a commercial research tool - a tablet app called 'QuantEye'. The app is yet to reach completion.
Start Year 2014
Title FireAnt 
Description FireAnt (Filter Input, Refine & Export) is a freeware tool for processing small, large, and very large tabular and hierarchical data sets, such as those generated by the Twitter API, Reddit, and so forth, for use in corpus, time-series, and network graph analyses. This software was developed in collaboration with Professor Laurence Anthony, Center for English Language Education in Science and Engineering, School of Science and Engineering, Waseda University, Japan. There are three main stages to the use of FireAnt:
1) Input: The user selects a file, or multiple files of the same type, to upload. These datasets can be in DB/DATA, JSON, CSV, TSV/TXT, and XLS/XLSX formats that are UTF-8 encoded.
2) Filter/refine: Once loaded, the user can then filter the dataset(s) in various ways, such as by date, id, frequency, and column content. This allows the user to refine the dataset whilst being able to see and check the results in real time.
3) Export: Once the user is satisfied with the refined dataset, they can then choose from three export options:
(a) The user can simply output the data into DB/DATA, JSON, CSV, TSV/TXT, or XLS/XLSX. For instance, they may choose DB/DATA to create a more refined dataset to work on in FireAnt later. They may choose CSV or XLS/XLSX to create a dataset for manual analysis. Or they may choose TSV/TXT to create a dataset for analysis with corpus software such as AntConc, WordSmith, Wmatrix, and so forth.
(b) Time-series: The user can output an incremental series of times and/or dates with values to plot a time series. They may choose XLS/XLSX to undertake this in software such as Excel, or they may choose CSV or TSV/TXT to undertake this in software such as R.
(c) Network graph: The user can output a node-list and an edge-list with which to create a network graph in software such as Gephi or Cytoscape.
FireAnt runs on any computer running Microsoft Windows (it has been tested on Win 98/Me/2000/NT, XP, Vista, Win 7, Win 8) and on Macintosh OS X computers (it has been tested up to OS X 10.9 Mavericks). It was developed in Python and Qt, using PyInstaller to generate executables for the different operating systems. At the time of writing, the current version of FireAnt (0.3.0b, released 8th August 2015) is still being tested and prepared for controlled release. Once this phase is completed and the software is ready for general release, it will be made available as freeware here:
Type Of Technology Software 
Year Produced 2015 
Impact This tool is yet to complete its final stages of testing, but once released, it is likely to have a wide-ranging impact for those working in corpus linguistics, primarily because it is extremely useful for working with large, detailed datasets typical of the online environment.
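
The filter and export stages described for FireAnt above can be illustrated with a minimal Python sketch. This is not FireAnt's code or its file format: the record fields (`user`, `created`, `mentions`), the sample data, and all function names are hypothetical and chosen purely for illustration. It shows a date filter, a daily time-series count, and a node-list/edge-list export written as tab-separated text of the kind that tools such as R or Gephi can typically import.

```python
import csv
import io
from collections import Counter
from datetime import date

# Hypothetical records: field names are assumptions, not FireAnt's or Twitter's schema.
TWEETS = [
    {"id": "1", "user": "alice", "created": "2013-07-27", "mentions": ["bob"]},
    {"id": "2", "user": "alice", "created": "2013-07-27", "mentions": ["bob", "carol"]},
    {"id": "3", "user": "dave", "created": "2013-07-28", "mentions": ["alice"]},
    {"id": "4", "user": "erin", "created": "2013-10-01", "mentions": ["alice"]},
]


def filter_by_date(rows, start, end):
    """Keep rows whose 'created' date falls within [start, end] inclusive."""
    return [r for r in rows if start <= date.fromisoformat(r["created"]) <= end]


def daily_counts(rows):
    """Return (date string, number of rows) pairs in chronological order."""
    return sorted(Counter(r["created"] for r in rows).items())


def edge_list(rows):
    """Return (source, target, weight) triples, one per author-to-mention pair."""
    edges = Counter((r["user"], m) for r in rows for m in r["mentions"])
    return sorted((s, t, w) for (s, t), w in edges.items())


def node_list(edges):
    """Return every account that appears as a source or target of an edge."""
    return sorted({n for s, t, _ in edges for n in (s, t)})


def to_tsv(header, rows):
    """Serialise a header and rows as tab-separated text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    kept = filter_by_date(TWEETS, date(2013, 7, 1), date(2013, 9, 30))
    edges = edge_list(kept)
    print(to_tsv(["date", "count"], daily_counts(kept)))
    print(to_tsv(["source", "target", "weight"], edges))
    print(to_tsv(["id"], [(n,) for n in node_list(edges)]))
```

Counting repeated author-to-mention pairs as an edge weight is one possible design choice; an unweighted edge list with one row per mention would serve equally well for many network graph tools.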
