ISIS: Protecting children in online social networks

Lead Research Organisation: Lancaster University
Department Name: Computing & Communications

Abstract

The aim of the Isis project is to develop an ethics-centred monitoring framework and tools for supporting law enforcement agencies in policing online social networks for the purpose of protecting children. The project will develop natural language analysis techniques to help identify paedophiles from chat logs and monitoring mechanisms that can be non-invasively attached to file sharing systems for identifying the distributors of child abuse media. The ethical issues associated with such monitoring activities will be rigorously studied and consistently fed back into the development of the framework and tools. The project results will be used and evaluated by the Child Exploitation and Online Protection (CEOP) centre as part of their own policing activities.

Recent years have seen a rapid rise in the number and use of online social networks, e.g., chat and file sharing systems. These social networks pose two significant risks in terms of child exploitation:

1. Paedophiles predating on children
Children actively participate in chat rooms and web-based communities. Paedophiles can use such forums to predate on children (highlighted by the arrest and conviction of Lee Costi [Guardian, 23/06/06]), or even to plan paedophilia-related activities (illustrated by the conviction of paedophiles who were using such systems to plan child abuse [Guardian, 6/02/07]). These concerns are reflected by the formation of the Virtual Global Taskforce, the launch of the CEOP and Scottish legislation to criminalise the 'grooming' of children in chat rooms.

2. Paedophiles distributing and sharing child abuse media
Paedophiles can formulate their own social networks using mechanisms, such as file-sharing systems, in order to distribute and share child abuse media. A recent study at Lancaster University found that 1.6% of searches and 2.4% of responses on the Gnutella peer-to-peer network relate to illegal sexual content (including child abuse media). Given the system's scale, these results suggest that, on the Gnutella network alone, hundreds of searches for illegal images occur each second.

Isis aims to address three major research challenges in this context:

1. How to identify active paedophiles across online communities?
Paedophiles often masquerade as children in order to establish contact with potential victims and gain their trust. Distinguishing the innocent interaction amongst children or amongst children and adults from such predatory advances is a non-trivial task. At the same time paedophiles may use multiple online identities and known paedophiles may move to other online social networks upon detection in one network. It is, therefore, vital that once a paedophile is detected in one network, s/he can be successfully detected in other networks which s/he may attempt to employ for grooming children.

2. How to identify the core distributors of child abuse media?
The key research challenge is to accurately identify child abuse media from the plethora of perfectly legal material that exists within file sharing systems. Paedophiles often use specialised vocabulary to describe their shared media (a vocabulary that evolves and changes over time) and operate over different file sharing networks. Any monitoring framework must be non-invasively attachable to existing file sharing systems given the wealth of such systems and clients available today. Such monitoring tools must also be able to distinguish core distributors of such media from mere users to help law enforcement agencies in tackling the problem at its roots.

3. How to ensure that such developments maintain ethical practices?
The development of such monitoring and analysis techniques raises a number of ethical challenges pertaining to utilising the framework and tools in a beneficial way for child protection while protecting innocent users of online social networks and safeguarding their privacy.

Publications

 
Description Online child protection is a key concern of our time. Online social networks are extremely popular with children, and the ubiquity and accessibility of these networks give child sex offenders easy access to potential victims. The scale of the problem is huge:
• 50% of teenagers report having given out personal information online, and 10% have engaged in physical meetings with strangers following online interactions (EU Kids Online project, summarising over 400 studies across Europe);
• 13% of children in London report occasions on which they believe they had been talking online to an adult posing as a child (London Metropolitan Police).
In attempting to combat online sex offenders, the cognitive load on law enforcement investigators is incapacitating because analysis of data from social networks is currently predominantly manual, which simply does not scale. This is because existing tools are primitive, typically offering only data extraction and simple keyword search capabilities. There is therefore an urgent need for more sophisticated tools that can efficiently carry out higher-level analyses of vast quantities of online data and report to investigators at the level of personas and behaviours.

The Isis Project focused on the problem of criminals hiding behind multiple identities (including, crucially, adults posing as children). The key tangible output of the project was an analysis suite called the Isis Toolkit. This was underpinned by 3 key research contributions:

Algorithms to establish a stylistic language 'fingerprint' of potential suspects or victims: these fingerprints can be overlaid to determine whether one person is hiding behind multiple personas or if multiple persons are sharing a single persona;

Algorithms to determine the age and gender of a person behind a digital persona: this is achieved by synthesising the stylistic language fingerprint with additional markers extracted using natural language analysis techniques;

Algorithms to determine online interaction patterns: this involves analysing conversational structures and language patterns (e.g. signature moves when signing off from a conversation, or frequently used words and phrases) to determine a specific persona's identifying characteristics.
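
The idea of overlaying stylistic fingerprints can be illustrated with a minimal sketch. This is not the Isis Toolkit's algorithm, which is not specified here; it assumes a toy fingerprint made of relative frequencies of a small, hypothetical list of function words, and compares two fingerprints with cosine similarity. The names FUNCTION_WORDS, fingerprint, cosine_similarity and overlay are illustrative only.

```python
import math
import re
from collections import Counter

# Hypothetical feature set: a handful of common English function words.
# A real stylometric system would use far richer features.
FUNCTION_WORDS = ("the", "a", "and", "to", "of", "i", "you",
                  "it", "is", "in", "that", "so", "but")


def fingerprint(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def overlay(text_a, text_b):
    """Compare the toy fingerprints of two texts; higher means more similar style."""
    return cosine_similarity(fingerprint(text_a), fingerprint(text_b))
```

Under these assumptions, a score close to 1 between messages posted under two different personas would only suggest a similar writing style; it would be a cue for an investigator to look more closely, not evidence of shared authorship.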
Exploitation Route The research has transformed the study of online crime and, in doing so, has created a new field of study, that of online digital persona analysis. The results from the research are impressive. The Isis Toolkit can detect masquerading tactics with a high degree of accuracy, for example detecting when an adult is masquerading as a child with an accuracy of 94% (compared to children participating in controlled experiments).
Sectors Communities and Social Services/Policy,Digital/Communication/Information Technologies (including Software),Education,Security and Diplomacy

URL http://www.comp.lancs.ac.uk/isis/
 
Description The research led to impact in four areas:

1. Collaborative research. The underpinning research was collaborative and profoundly cross-disciplinary, covering areas including computer security, linguistic analysis, HCI, law/ethics and social science. The underlying natural language processing results stemmed from a long-running, very successful, cross-disciplinary collaboration between Computer Science and Linguistics at Lancaster (under the auspices of the UCREL Research Centre).

2. Research with user communities. The research involved real-world deployments with law enforcement agencies, and direct engagement with schools. In particular, live trials were held with various UK law enforcement agencies. In addition, teaching sessions were organised at the Queen Elizabeth School, Kirkby Lonsdale (2009-2010) and at the Lancaster Girls' Grammar School (2011) to help children understand the masquerading tactics utilised by online sex offenders.

3. Commercialisation. A spin-out company was created with the support of the KE services that are embedded in the School of Computing and Communications (in InfoLab).

4. The work has contributed strongly to the national and international debate around Internet governance and online safety.

Beneficiaries: Policy makers are the prime beneficiaries in this area. A policy paper was prepared for the BCS and presented to Alun Michael, MP (2009), and subsequently selected as the (single) UK contribution to the 2009 and 2010 Internet Governance Forums (in Sharm El Sheikh and Vilnius respectively). Written evidence was also provided to the Commons Select Committee on Education (2010). The research also contributed to the Proposal for a Directive of the European Parliament and of the Council on combating the sexual abuse, sexual exploitation of children and child pornography, repealing Framework Decision 2004/68/JHA (COM/2010/0094). The research also provided a case study in a report requested by the European Parliament's Committee on Gender Equality proposing the extension of the Isis Toolkit to assist in the detection and management of cyber coercion and rape of women and girls.

Secondary beneficiaries are the general public, who benefit from our significant efforts to raise the awareness of online protection issues through the media. The project was also selected as one of the 100 big ideas of the future in a joint report by RCUK/Universities UK. The project also led on to a European Safer Internet Project, iCOP, on techniques for identifying originators of child abuse media on peer-to-peer networks, which had a further impact in terms of take-up across a range of European law enforcement agencies.
First Year Of Impact 2008
Sector Communities and Social Services/Policy,Digital/Communication/Information Technologies (including Software),Education,Government, Democracy and Justice,Security and Diplomacy
Impact Types Societal,Economic,Policy & public services

 
Description EPSRC
Amount £343,912 (GBP)
Funding ID EP/J005053/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 02/2012 
End 05/2014
 
Description EPSRC
Amount £200,396 (GBP)
Funding ID EP/I016546/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 11/2010 
End 04/2012
 
Description European Commission - Belgium
Amount £405,000 (GBP)
Funding ID SI 2601002 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 06/2011 
End 11/2013
 
Description European Commission - Belgium
Amount £405,000 (GBP)
Funding ID SI 2601002 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 06/2011 
End 05/2013