Responsible AI for Inclusive, Democratic Societies: A cross-disciplinary approach to detecting and countering abusive language online

Lead Research Organisation: University of Sheffield
Department Name: Computer Science

Abstract

Toxic and abusive language threaten the integrity of public dialogue and democracy. Abusive language, such as taunts, slurs, racism, extremism, crudeness, provocation and disguise, is generally considered offensive and insulting, and has been linked to political polarisation and citizen apathy; the rise of terrorism and radicalisation; and cyberbullying. In response, governments worldwide have enacted strong laws against abusive language that leads to hatred, violence and criminal offences against a particular group. This includes legal obligations to moderate (i.e., detect, evaluate, and potentially remove or delete) online material containing hateful or illegal language in a timely manner; and social media companies have adopted even more stringent regulations in their terms of use. The last few years, however, have seen a significant surge in such abusive online behaviour, leaving governments, social media platforms, and individuals struggling to deal with the consequences.

The responsible (i.e. effective, fair and unbiased) moderation of abusive language carries significant practical, cultural, and legal challenges. While current legislation and public outrage demand a swift response, we do not yet have effective human or technical processes that can address this need. The widespread deployment of human content moderators is costly and inadequate on many levels: the nature of the work is psychologically challenging, and even significant efforts lag behind the deluge of data posted every second. At the same time, Artificial Intelligence (AI) solutions implemented to address abusive language have raised concerns about automated processes that affect fundamental human rights, such as freedom of expression and privacy, and about a lack of corporate transparency. Tellingly, the first moves to censor Internet content focused on terms used by the LGBTQ community and AIDS activism. It is no surprise, then, that content moderation has been dubbed by industry and media a "billion dollar problem." Thus, this project addresses the overarching question: how can AI be better deployed to foster democracy by integrating freedom of expression, commitments to human rights and multicultural participation into the protection against abuse?

Our project takes on the difficult and urgent issue of detecting and countering abusive language through a novel approach to AI-enhanced moderation that combines computer science with social science and humanities expertise and methods. We focus on two constituencies infamous for toxicity: politicians and gamers. Politicians, because of their public role, are regularly subjected to abusive language. Online gaming and gaming spaces have been identified as private "recruitment sites" for extreme political views and linked to offline violent attacks. Specifically, our team will quantify the bias embedded within current content moderation systems, whose rigid definitions or determinations of abusive language may paradoxically create new forms of discrimination or bias based on identity, including sex, gender, ethnicity, culture, religion, political affiliation or other characteristics. We will offset these effects by producing more context-aware, dynamic systems of detection. Further, we will empower users by embedding these open source tools within strategies of democratic counter-speech and community-based care and response. Project results will be shared broadly through open access white papers, publications and other online materials with policy, academic, industry, community and public stakeholders. This project will engage and train the next generation of interdisciplinary scholars, who are crucial to the development of responsible AI.
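To illustrate what quantifying such bias can look like in practice (this is a sketch only, not the project's own method), one common approach is to score benign template sentences that differ only in the identity term they mention and compare false-positive rates across groups. The model name, templates and threshold below are illustrative assumptions.

    # Illustrative sketch: measuring identity-term bias in an off-the-shelf toxicity classifier.
    # The model, templates and threshold are assumptions, not project deliverables.
    from transformers import pipeline

    classifier = pipeline("text-classification", model="unitary/toxic-bert")  # assumed public model

    identity_terms = ["women", "men", "gay people", "Muslims", "immigrants"]
    templates = [
        "I am friends with many {}.",
        "{} deserve the same respect as everyone else.",
        "My neighbours are {} and we get along well.",
    ]
    threshold = 0.5  # arbitrary flagging cut-off for this illustration

    for term in identity_terms:
        sentences = [t.format(term) for t in templates]
        results = classifier(sentences)
        # Every sentence is benign, so any confident 'toxic' prediction is a false positive.
        fpr = sum(r["label"].lower() == "toxic" and r["score"] > threshold
                  for r in results) / len(results)
        print(f"{term:>12}: false-positive rate on benign templates = {fpr:.2f}")

A large gap in false-positive rates between identity terms on otherwise identical benign sentences is one indicator of the kind of embedded bias the project aims to measure and mitigate.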

With its focus on robust AI methods for tackling online abuse in an effective and legally compliant manner, vital to the vigour of democratic societies, this research has wide-ranging implications and relevance for Canada and the UK.

Planned Impact

Main Beneficiaries:

1) The public: The prevalence of cyber abuse has led to many government and industry attempts to curb its occurrence through prevention and policy; however, these attempts are hindered by the massive, dynamic volume of online content and by the largely ineffective and time-consuming nature of current abuse moderation methods. The project seeks to address these challenges while also considering content moderation biases that tend to disproportionately tag certain individuals' and communities' language as toxic. These biases affect public dialogue, democratic participation and certain legal rights, such as freedom of expression, equality and privacy.

2) Policy makers and NGOs: The results generated by this project will help policymakers (e.g., in economic diversification and innovation, justice, privacy, gender and equality) and NGO/community stakeholders (e.g., Amnesty International, Reporters Without Borders) establish guidelines for addressing online abusive language and will inform them of its impacts. It will also provide alternative responsible (effective, unbiased and fair) methods for countering abusive language. Research results will contribute to a more balanced and democratic moderation of political dialogue and engagement while protecting politicians and users against abuse.

3) Technology companies: Companies such as Intel are seeking to work with academics and NGOs on abuse prevention, especially as policies and regulatory frameworks are being developed. Gaming is also an important site for the tech industry, with more than 4% yearly growth globally. The community of gamers is growing more diverse (~50% women in Canada in 2018). However, gaming can be a very toxic environment in terms of sexism, racism and other discriminatory forms of abuse, which ultimately limits the size of the gaming market.

4) Law enforcement agencies and social media companies: The responsible NLP methods arising from this project could be incorporated into existing tools, helping law enforcement agencies and social media companies detect and counter online abuse in real time.
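As a purely hypothetical illustration of how such methods might slot into a real-time workflow, the sketch below wraps a generic scoring function around a stream of incoming posts; the function name, threshold and routing behaviour are assumptions introduced here, not descriptions of any existing tool.

    # Illustrative sketch: routing high-scoring posts to human review in (near) real time.
    # score_abuse() is a stand-in for any trained abuse classifier; the threshold is arbitrary.
    from typing import Callable, Iterable

    def moderate_stream(posts: Iterable[str],
                        score_abuse: Callable[[str], float],
                        threshold: float = 0.8) -> list[tuple[str, float]]:
        """Return posts whose abuse score exceeds the threshold, for human review."""
        flagged = []
        for post in posts:
            score = score_abuse(post)
            if score >= threshold:
                # In practice this would queue the post for reviewers or counter-speech tooling,
                # rather than deleting it automatically.
                flagged.append((post, score))
        return flagged

    # Example usage with a dummy scorer (a real deployment would call a trained model).
    demo_posts = ["thanks for sharing this", "you are worthless, get off this platform"]
    print(moderate_stream(demo_posts, score_abuse=lambda text: 0.9 if "worthless" in text else 0.1))

Keeping the final decision with human reviewers rather than automating removal reflects the project's emphasis on rights-respecting, context-aware moderation.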

5) Media companies and stakeholder engagement: Through previous projects, we have already established, and will leverage, collaborations with Buzzfeed, BBC News, ITV, the Reuters Institute for the Study of Journalism and Google; and we will promote research results through the Centre for Freedom of the Media/UNESCO Journalism Safety Research Network.

6) Early career researchers (ECRs)/students: The project will help advance emerging scholars' research trajectories by offering training in interdisciplinary research skills, widening collaborations in the UK, Canada, and the USA, and engaging them in cutting-edge research methods with major social impacts and benefits.

Impact and Outreach Activities:

To achieve maximum impact, project outputs will be made open source. Project results will contribute to more responsible AI methods for detecting online abusive language. This in turn will increase user confidence through platforms' greater compliance with relevant policies, human rights and legal frameworks, and will reinforce key socio-economic and Digital Economy areas, namely online gaming, social platform companies, digital journalism, and content moderation technologies and services.

Policy impact will result from knowledge shared in Canada, the UK, and the US (through AI Now). We will draw on the experience of the UK PI, who has just submitted written evidence on the online abuse of UK MPs to the UK Parliamentary inquiry on Democracy, free speech and freedom of association, and we will harness the Industry and Parliament Trust. The Canada PI will share new findings with a network of over 35 collaborating scholars and policy/community/industry partners through the Canada 150 Research Chair/SFU Digital Democracy Group.
