ESRC Capital Funding: Social Data Science Lab - Continuation of Methods and Infrastructure Development for Open Data Analytics in Social Research

Lead Research Organisation: Cardiff University
Department Name: Computer Science

Abstract

The Social Data Science Lab has been established with the mission of democratising access to big social data among the academic, public and third sectors, and to support real-time social data analytics for research, policy & practice. The proposed capability project is designed to address existing technical and methodological shortcomings in our ability to marshal big social data for social research purposes. In particular, the project will provide enhanced and sustainable social media data collection and analysis technologies to academic, public and third sector researchers. This capability project will achieve this by:

1. Providing the required technical resource and expertise to: i) optimise existing Lab social media analytics technologies for more efficient social media data collection, transformation and analysis across operating platforms (Windows, Linux, Mac OS); ii) integrate existing social media analytics tools (such as demographic and text classification tools); iii) make existing Lab tools extendable by the researcher community; iv) adapt Lab tools to user requirements and changes in social media provider technologies (e.g. API changes); and v) support researchers in their social media data and analysis needs via dedicated Lab training and working papers;

2. Providing the required social science resource and expertise to: i) liaise with the researcher community to gather social media tool requirements; ii) liaise with ESRC administrative, local government and consumer big data centres; iii) write training materials and coordinate capability building activities; and iv) support researchers in their data and analysis queries;

3. Ensuring the required investigator time to: i) manage the optimisation and enhancement of Lab tools and to implement a 'sustainability' business model; ii) manage existing partnerships with public, private and third sector users and the various ESRC big data network centres in the UK and elsewhere; iii) develop new partnerships with data providers; and iv) inform and oversee the development of world-leading training and capability building in Big Social Data Analytics;

4. Exploring options for the sustainable processing of social media analytics within UK HE research infrastructure and providing an options paper for use by existing ESRC big data investments, and ultimately Phase 3 of the network.

Planned Impact

The project will have five main categories of beneficiary: (1) academic communities in the fields of social science, computer science, health studies and medicine, and arts and humanities; (2) government agencies that have a remit to engage with big social data; (3) law enforcement agencies; (4) voluntary sector organisations; and (5) private corporations with an interest in big social data. The main activities for realising potential benefits to these groups are:

1. The provision of free access to Lab social media analytics technologies for not-for-profit use;

2. The provision of free access to Lab social media analytics capability building and training materials, including webinars and online support community;

3. The invitation to an International Conference on Computational Social Science;

4. The recruitment onto the MSc in Social Data Science Part-Time route;

5. The continued support of industry-Lab partnerships (with the likes of Admiral Insurance and Airbus);

6. The continued support of government-Lab partnerships (with the likes of the ONS Data Science Campus, Home Office, Ministry of Justice and the Department for International Development);

7. The recruitment of non-academic government, voluntary and industry members onto the Steering Committee for the Lab social media analytics capability programme.

The Social Data Science Lab will leverage its existing relationships to achieve these activities. Existing links include 1) Private sector: Twitter US & UK; Google UK; Airbus Group; Admiral Insurance; RAND Corporation; RAND Europe; Fujitsu and High Performance Computing Wales; Sage Publications; and NatCen Social Research, 2) Public sector: Ministry of Justice; Home Office; Food Standards Agency; Department of Health; Department for International Development; Office for National Statistics Data Science Campus; Welsh Government; College of Policing; Metropolitan Police Service; City of London Police; UK Data Archive, and 3) Third sector: Tell Mama; Community Security Trust; Race Equality First; Stonewall.

Lab social media analytics technologies have been used by over 1000 organisations in over 30 countries, including all UK Russell Group universities, several top US universities (including Stanford, Cornell and MIT) and many non-academic institutions (including BBC; Foreign and Commonwealth Office; Citizens Foundation Iceland; Girl Guides; Dept. Work and Pensions; Bolton Council; MySociety; Police Foundation; West Midlands Police; CPS; Shelter; Scottish Government; Dept of Health; ONS; South Lanarkshire Council; Community Security Trust; Cabinet Office; College of Policing; Public Health Canada; Salvation Army; Institute for Sustainable Communication; Detroit Crime Commissions; Understanding Animal Research; British Geological Survey; Medway Council; European Space Agency; UK MOD Army; National Response Center for Cyber Crimes (Pakistan); National Library of Scotland; Dept. for Culture Media and Sport; HEFCW; Carmarthen County Council; Ceredigion Council; Airbus Group; British Institute of Human Rights; McKinsey; US Army; Fair Trials Intl.; Turkish National Police; Intl. Civil Society Centre; and Public Health England). We will provide all enhanced social media analytics technologies to these organisations and continue to support data and analysis needs where required and possible.

Cardiff University recognises the Social Data Science Lab's impact to date, and is supporting its impact plan going forward by locating it within the Nesta sponsored Social Science Research Park (SPARK). See Pathways to Impact for further details.
 
Title COSMOS - Automation of the download request process. 
Description Users can request to download the COSMOS application. Previously, an administrator had to approve each download request manually and send the user a link to the latest version. This process has been automated with a cron job and a PHP script that handle the email received from the user, store it in a Google Drive document, and send out a welcome email with the link to download COSMOS. 
Type Of Material Improvements to research infrastructure 
Year Produced 2017 
Provided To Others? No  
Impact Saved the development and administration teams considerable time and helped the research community to receive a faster response. 
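
The automated flow described in this record (a scheduled job that records each request and replies with a download link) could be sketched roughly as follows. This is only an illustrative Python sketch, not the Lab's PHP script: `DOWNLOAD_URL`, `SENDER`, `DownloadRequest` and `process_requests` are hypothetical names, a plain list stands in for the Google Drive document, and actually sending the messages is left out.

```python
from dataclasses import dataclass
from email.message import EmailMessage
from typing import List, Tuple

# Hypothetical placeholders: the real link and sender address are not given in the record.
DOWNLOAD_URL = "https://example.org/cosmos/latest"
SENDER = "cosmos-team@example.org"


@dataclass
class DownloadRequest:
    name: str
    email: str


def build_welcome_email(request: DownloadRequest) -> EmailMessage:
    """Compose the welcome message that carries the download link."""
    msg = EmailMessage()
    msg["From"] = SENDER
    msg["To"] = request.email
    msg["Subject"] = "Your COSMOS download link"
    msg.set_content(
        f"Hello {request.name},\n\nYou can download COSMOS here: {DOWNLOAD_URL}\n"
    )
    return msg


def process_requests(
    pending: List[DownloadRequest], log: List[Tuple[str, str]]
) -> List[EmailMessage]:
    """One scheduled run: record each request, then prepare its welcome email."""
    outbox = []
    for request in pending:
        # The list stands in for the shared document that stores requests.
        log.append((request.name, request.email))
        outbox.append(build_welcome_email(request))
    return outbox
```

A scheduler such as cron would call `process_requests` at a fixed interval and hand the returned messages to a mail transport.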
 
Title COSMOS - Bug Tracker 
Description Any issues in design and coding that cause incorrect results are considered software bugs. In a software development life cycle, tracking bugs is one of the most important aspects. Where a user has encountered an issue and submitted it via the Help Desk, but the team could not resolve it immediately, then it is reported as a bug using the COSMOS Bug tracker. 
Type Of Material Improvements to research infrastructure 
Year Produced 2017 
Provided To Others? No  
Impact The Bug Tracker improves connectivity between COSMOS development team members and plays an important role in providing feedback to the user when the bugs are resolved. It is also proving key in helping the team prioritize issues during software development sprints, and in delivering a high-quality sustainable product. The Bug tracker will continue to provide the development team with information on how to fix and improve COSMOS over the full duration of the capability grant. 
 
Title COSMOS Help Desk 
Description The help desk was designed as a one-stop-shop support mechanism for those COSMOS users encountering issues with the platform. Users are able to submit a 'ticket' detailing their issue and track their submission/check status of their problem, to identify when it is resolved. 
Type Of Material Improvements to research infrastructure 
Year Produced 2017 
Provided To Others? No  
Impact We are now able to maintain a continued dialogue with users until an issue is resolved, and our team is able to build up a knowledge base around the errors encountered. User-driven feature requests can also be submitted, making it easier for us to feed them into future enhancements and development plans. 
URL http://www.cosmos-support.net
 
Description Cardiff University Airbus Centre of Excellence in Cyber Security Analytics 
Organisation Airbus Group
Department Airbus Operations
Country United Kingdom 
Sector Private 
PI Contribution Burnap is the director of the Centre; Anthi is a core IoT researcher within the Centre. Burnap leads IoT research for Airbus in the context of Industrial IoT.
Collaborator Contribution Airbus are providing support to build an industrial IoT testbed as part of the IoTDepends project - this will underpin the research co-produced by Cardiff University and Airbus
Impact £760k research project funded by Endeavr Wales to study intrusion detection and probabilistic modeling of cyber attacks on Industrial Control Systems (SCADA); £1.8m EPSRC research project studying the impact of IoT and sensors embedded in products of the future to support automated "Chatty Factories" of the Future; Journal article in Computers and Security (Malware Classification and Machine Learning); Journal article in IEEE Computer (Goal Oriented Risk Modeling); Journal article research has been transitioned into enhanced products and services within Airbus (Malware Classification -> SOC, Risk Modeling -> Risk consulting business)
Start Year 2017
 
Title COSMOS - Automation of the download request process. 
Description Users can request to download the COSMOS application. Previously, an administrator had to approve each download request manually and send the user a link to the latest version. This process has been automated with a cron job and a PHP script that handle the email received from the user, store it in a Google Drive document, and send out a welcome email with the link to download COSMOS. 
Type Of Technology Webtool/Application 
Year Produced 2017 
Impact Saved the development and administration teams considerable time. 
 
Title COSMOS - Bug Tracker 
Description Any issues in design and coding that cause incorrect results are considered software bugs. In a software development life cycle, tracking bugs is one of the most important aspects. Where a user has encountered an issue and submitted it via the Help Desk, but the team could not resolve it immediately, then it is reported as a bug using the COSMOS Bug tracker. 
Type Of Technology Webtool/Application 
Year Produced 2017 
Open Source License? Yes  
Impact The Bug Tracker improves connectivity between COSMOS development team members and plays an important role in providing feedback to the user when the bugs are resolved. It is also proving key in helping the team prioritize issues during software development sprints, and in delivering a high-quality sustainable product. The Bug tracker will continue to provide the development team with information on how to fix and improve COSMOS over the full duration of the capability grant. 
URL http://cosmos-support.net/reportbug
 
Description A meeting with UK Data Archive 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact UKDA are archiving New Forms of Data, including social media. From our two meetings, it was evident that COSMOS could be enhanced by adding the functionality to link to UKDA datasets containing tweet IDs, to 'rehydrate' them via the Twitter API. This enhancement would function in the same way as a DOI that points to the dataset used to produce the results in an academic paper. The enhancement would also allow users of COSMOS to deposit tweet IDs into the UKDA. These rehydration and deposit functions would facilitate big social data reuse and study replication. The enhancement fits with UKDA, ESRC and the Social Data Science Lab strategic priorities, and therefore we have agreed to collaborate on this over the coming year.
UKDA are in need of documentation for data reusers on the technical, methodological and ethical reuse of social media data. The Lab has agreed to co-author this documentation with UKDA.
UKDA are trialing the use of their new High-Performance Computing infrastructure in the storage, management, and analysis of big data sources (currently smart energy meter data). It was agreed that the Lab would work with UKDA on experimenting with social media analysis using this architecture. It may be possible to use this architecture to power some of the back-end COSMOS processes, speeding up analysis for heavy users significantly.
Year(s) Of Engagement Activity 2017
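
The 'rehydration' idea discussed in this record (turning archived tweet IDs back into full tweets through an API lookup) can be illustrated with a small Python sketch. It is not COSMOS code: `rehydrate`, `batched` and the `fetch` callable are hypothetical names, no real API is called, and the batch size of 100 IDs per request is an assumption about the lookup endpoint.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Assumption: the lookup endpoint accepts at most 100 IDs per request.
BATCH_SIZE = 100


def batched(ids: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def rehydrate(
    ids: Iterable[str],
    fetch: Callable[[List[str]], Dict[str, dict]],
) -> Dict[str, dict]:
    """Look up tweets batch by batch.

    `fetch` is a hypothetical callable that wraps the real API request and
    returns a mapping from tweet ID to tweet object. IDs that no longer
    resolve (deleted or protected tweets) are simply absent from the result.
    """
    tweets: Dict[str, dict] = {}
    for batch in batched(list(ids)):
        tweets.update(fetch(batch))
    return tweets
```

Because only IDs are deposited and the content is fetched again at reuse time, a replication study sees only the tweets that are still available, which is worth recording alongside any results.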
 
Description A meeting with Public Health England 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Industry/Business
Results and Impact PHE is interested in developing a customized plugin to be used as part of the COSMOS application. The meeting, held in March 2018, aimed to collect requirements as part of our Sustainability plan for COSMOS.
Year(s) Of Engagement Activity 2018
 
Description COSMOS Awareness postgraduate students meeting 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Conducted one-to-one meetings with MSc students from the Computer Science and Social Science schools to demonstrate the COSMOS analytic tool and encourage students to use it as part of their dissertation projects.
Year(s) Of Engagement Activity 2019
 
Description COSMOS Feedback Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact A workshop to understand how COSMOS is used and to review it in detail, in order to identify useful features and generate ideas for future improvements. Participants were COSMOS users from around the UK, the UX researcher and the COSMOS development team. The aim of the workshop was to identify the main issues and future improvements to inform the next release.
Year(s) Of Engagement Activity 2018
 
Description COSMOS awareness and training session 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Undergraduate students
Results and Impact Encourage social science undergraduate students to use COSMOS as an analytic tool as part of their dissertation.
Hold an event to demonstrate COSMOS's features and future plan.
Support COSMOS's users with technical errors.
Use the user's feedback to include in the current COSMOS development phase.
Year(s) Of Engagement Activity 2019
 
Description COSMOS integration with Data Cymru 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Industry/Business
Results and Impact Data Cymru is a local government unit that works as an entry point to different data sources in the UK. They are developing a new API that facilitates access to these data sources. The COSMOS team had two meetings with the Data Cymru team to discuss potential further integration and provided a full demonstration of COSMOS's abilities and future plans. COSMOS and Data Cymru agreed on the following:
- Build a partnership with the COSMOS team and encourage other data providers to do the same.
- Open a channel of communication with the local community to present COSMOS.
- Data Cymru will provide an API for COSMOS to integrate with.
- Look into research collaboration in areas such as public health, mental health and social networks.
Year(s) Of Engagement Activity 2019
URL http://www.dataunitwales.gov.uk/home
 
Description Different back-end options for COSMOS 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact As part of an ongoing effort to expand COSMOS's capabilities, the COSMOS team arranged a meeting with ARCCA, supercomputing.wales and Atos, as Cardiff University is in the process of purchasing a resources service from Atos to extend its infrastructure. Atos is a European IT services corporation with its headquarters in Bezons, France and offices worldwide. It specialises in hi-tech transactional services, unified communications, cloud, big data and cybersecurity services. The COSMOS team demonstrated a use-case scenario for COSMOS as an analytic tool and discussed possible integration of COSMOS into the Atos or Supercomputing Wales infrastructure to allow social science researchers to extend their analytic capabilities.
Year(s) Of Engagement Activity 2019
 
Description Meeting with ADRC-Wales 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact We will jointly explore the possibility of providing training to ADRC-W in Swansea around the topic of social data science (e.g. social network analysis).
COSMOS will be utilised as an in-house resource within ADRC-W to provide the function of social media data analysis and linking with administrative data.
We will deliver a seminar to the ADRC network on social media analysis (showcasing COSMOS).
We will co-develop an emotion extraction plugin for COSMOS at the request and design of ADRC-W.
Year(s) Of Engagement Activity 2017
 
Description Meeting with MOPAC and MPS 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Potential collaboration to explore developing an intelligent methodology to detect the spread of hate; this will form part of the model integration later.
Detect the network associated with users posting hate and explore network metrics, especially the networks of haters and hated.
COSMOS requirements: - Top 10 hashtags. - Attacks targeted at a certain individual: how many people have been targeted? An @mention is one way to measure this, but it will also pick up counter-speech and responses. - Include a frequency of hate posts with @mentions included in the text.
Year(s) Of Engagement Activity 2018
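
The two counting requirements listed in this record (top 10 hashtags, and the frequency of posts containing @mentions) could be prototyped along the following lines. This Python sketch is not part of COSMOS: the function names and regular expressions are assumptions, the input is assumed to be post texts that an upstream classifier has already flagged, and, as the record itself notes, @mention counts will also include counter-speech and replies.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Simplified patterns; real platform tokenisation rules are more involved.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")


def top_hashtags(posts: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most frequent hashtags, case-insensitively."""
    counts = Counter(
        tag.lower() for text in posts for tag in HASHTAG.findall(text)
    )
    return counts.most_common(n)


def mention_frequency(posts: Iterable[str]) -> Counter:
    """Count, for each account, how many posts mention it at least once."""
    counts: Counter = Counter()
    for text in posts:
        # A set ensures each account is counted once per post.
        counts.update({name.lower() for name in MENTION.findall(text)})
    return counts
```

Counting posts rather than raw mentions keeps a single post that repeats the same handle from inflating the figure for that account.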
 
Description Meeting with Welsh Water Board representative 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact Explored potential collaboration with Welsh Water on developing a customized plugin based on Welsh Water's requirements. The aim is to learn more about their customers, especially as more companies are expected to join the water sector within 5 years. Welsh Water is therefore looking to integrate more intelligent social network tools to improve engagement with its customers.
Year(s) Of Engagement Activity 2018
 
Description Providing COSMOS demo to Cardiff Metropolitan University staff members 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Provided a 2-hour presentation with a hands-on demo of COSMOS's features and future plans.
Year(s) Of Engagement Activity 2019
 
Description Providing COSMOS demo to Swansea University Staff member 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Provided a 2-hour presentation with a hands-on demo of COSMOS's features and future plans.
Year(s) Of Engagement Activity 2019
 
Description Providing a COSMOS presentation to Bristol University Staff member 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Provided a 2-hour presentation with a hands-on demo of COSMOS's features and future plans.
Year(s) Of Engagement Activity 2019
 
Description Delivering a COSMOS workshop at the University of Edinburgh 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact The Lead Software Engineer and a Research Associate delivered a one-day COSMOS workshop at Edinburgh University on how to use COSMOS in research and teaching activities. Thirty participants attended from across the UK, the majority from Scotland. The workshop was a hands-on activity that included presentations from participants on their findings and feedback on using COSMOS. After the workshop, participants reported that they would integrate COSMOS as a tool in their research and teaching.
Year(s) Of Engagement Activity 2018