What are the odds? Capturing and exploring data created by online political gambling markets.

Lead Research Organisation: Swansea University
Department Name: School of Arts and Humanities


The 'What are the Odds?' project builds on the intuition that the odds offered on gambling markets inform us about participants' perception of the likelihood of possible outcomes for a given event. 'Long' odds, which offer a high return, tend to be offered on (perceived) unlikely outcomes, while 'short' odds, which only offer a very small return, indicate that an outcome is considered likely. Furthermore, odds on offer vary over time as new information becomes available.
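The link between odds and perceived likelihood described above can be made concrete with a small sketch. It assumes the usual bookmaker conventions (fractional odds such as 4/1, and decimal odds that include the returned stake) and converts them to implied probabilities. Because bookmakers build in a margin, the raw implied probabilities for a market usually sum to more than 1; rescaling them to sum to 1 is one simple adjustment, shown here as an illustration rather than as the project's own method. The function names and the example market are hypothetical.

```python
def fractional_to_decimal(numerator: int, denominator: int) -> float:
    """Convert fractional odds (e.g. 4/1) to decimal odds, which include the stake."""
    return numerator / denominator + 1.0


def implied_probability(decimal_odds: float) -> float:
    """Implied probability of an outcome is the reciprocal of its decimal odds."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    return 1.0 / decimal_odds


def normalise(probabilities: dict) -> dict:
    """Rescale implied probabilities so that they sum to 1 (removes the margin)."""
    total = sum(probabilities.values())
    return {outcome: p / total for outcome, p in probabilities.items()}


# Hypothetical two-outcome market: 'long' odds of 4/1 and 'short' odds of 1/5.
raw = {
    "long": implied_probability(fractional_to_decimal(4, 1)),
    "short": implied_probability(fractional_to_decimal(1, 5)),
}
adjusted = normalise(raw)
```

In this hypothetical market the long-odds outcome has a raw implied probability of 0.2 and the short-odds outcome roughly 0.83; after rescaling, the two values sum to 1.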
'What are the Odds?' leverages the fact that most contemporary online gambling websites offer odds on political outcomes. Because these odds are published online, it is possible to gather them automatically, and to compile records of how the markets fluctuate over time. The project will gather and analyse gambling market data concerning elections and referendums, the key means by which citizens influence policy in representative democracies.
From an academic viewpoint, online political gambling markets provide unique data about the shifts that take place during election campaigns. Such shifts are currently measured via polling data - however, polls are highly expensive, tend to be conducted at irregular intervals, and polling houses differ in sampling and weighting methodologies. Online gambling markets provide free, minute-by-minute snapshots of outcome likelihoods during a campaign that are all generated by the same mechanism. Online gambling markets are also often available for individual candidates within their respective constituencies - making them a rich new resource for campaign scholars.
However, gathering and organising such data is not straightforward, especially given the number of sites offering political betting markets - each site has a specific format and features, which must be taken into account. Furthermore, in line with the research call, these data are 'big' - in terms of volume (for instance, the UK general election dataset size is estimated at 9.5 GB) and velocity (the speed at which markets can react to external events). Storing and analysing such data require bespoke tools and techniques. For this reason, 'What are the Odds?' is a collaboration between political scientists and computer scientists, whose key goal is to allow both the research team and other researchers to capture and analyse the data created by online political gambling markets. This will be done by creating a bespoke project website with a 'research' face containing the tools, techniques and data generated by the project.
'What are the Odds?' will facilitate research activity that is inherently interdisciplinary - combining political science interests (psephology, public opinion analysis and electoral forecasting) with computer science concerns (database management, algorithm development, and data visualisation). The project is also international in scope, including researchers from both the UK (Drs. Stephen Lindsay and Matthew Wall from Swansea University) and Ireland (Dr. Rory Costello from the University of Limerick) and will collect data on elections that are scheduled to take place in the UK and Ireland during or shortly after the research period.
The requested funding will be used by the research team to develop reliable, open-source tools for collecting and analysing minute-by-minute updates from electoral markets from a range of websites. It will allow us to hire a technician to work on the details of these tools and the project website over a 15-month period. We will seek to promote this exciting new source of political information to the general public (via an easy-to-understand web page and a media promotion campaign) and to scholars (by developing the project website's 'research face', presenting our work at departmental seminars at Swansea University and the University of Limerick, at three international conferences, and in peer-reviewed journal publications).

Planned Impact

The 'What are the Odds?' project fits neatly with the description in the call of 'smaller projects' as: 'pilot or scoping studies, which, if successful, could then form the basis for a larger more robust project in the future.' This funding will allow us to resolve the technical and methodological issues that must be addressed to make future research employing online political gambling markets possible. Such future research will seek to investigate the potential of these data for political analysis, including campaign event analysis, candidate behaviour analysis and electoral forecasting. Furthermore, both the combined use of automated scraping with crowd-sourced checking and the visualisation of political market trends represent fruitful avenues for further interdisciplinary funding applications to be developed by the team.
The profile of the research team also fits the call specifications - the principal investigator and co-investigators are all Early Career Researchers, according to the AHRC's criteria. Thus, management of and participation in 'What are the Odds?' will develop the skill sets of all team members, enhancing their academic experience levels, capacity to work on further, related projects, and employability.
1. Wider Public
In terms of societal impact, a key objective of 'What are the Odds?' is to inform and educate members of the public in the UK and Ireland about the information on electoral probabilities contained in online gambling markets.
The intended beneficiaries are members of the British and Irish public interested in following campaigns and elections.
The project website's 'public face' will feature regularly updated and easily understood representations of upcoming elections from live online gambling markets, as well as multimedia explanatory materials.
The public will be alerted to this site via targeted media engagements in the UK (Wall) and Ireland (Costello) during the campaigns to be studied in this project.
2. Academia
In order for analyses based on gambling market data to influence and inform media coverage and public understanding of elections, it is necessary that such analyses be disseminated within academia and scrutinised by academic experts.
'What are the Odds?' will primarily impact on scholars engaged in electoral studies, particularly within the UK and Ireland. The tools and techniques developed in the project will also impact on social sciences more widely, and on the international computing science research community.
The project website's 'research face' will allow academics to engage with the datasets that this project will create for elections in Ireland and the UK, and/or to create bespoke datasets for other elections or any other live online gambling market. The Information Gathering Program (IGP) source code will also be published on the project website, representing a valuable resource for computer scientists interested in large scale page scraping.
Conference strategy
The principal investigator and each co-investigator have targeted high profile conferences to disseminate this research and to publicise the project website. Dr. Wall will attend the UK's 2014 Elections, Public Opinion and Parties (EPOP) Conference, which brings together leading international electoral studies specialists as well as the UK's major polling companies. Dr. Costello will present at the 2014 Political Studies Association of Ireland (PSAI) Annual Conference, while Dr. Lindsay will present at the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing (Ubicomp), to an international and prestigious audience of computer scientists.
Paper strategy
As discussed in the 'Objectives' document, the project will target research papers in Political Analysis and ACM Transactions on Interactive Intelligent Systems - which are high impact peer-reviewed journals within political science and computing science, respectively.


Description 1) An Information Gathering Program: The project has developed an algorithm that allows us to gather data from over 20 gambling websites on key political events. The project has focused on elections, namely the European Parliament elections, the Scottish Independence Referendum and the upcoming UK and Irish General elections. The data that we collect capture the average 'price' available on these markets (one example would be UKIP's probability of winning the most votes in the 2014 European Parliament Elections) over a finely grained series of time points; our current 'standard' procedure captures fresh data every 15 minutes.

We are also refining procedures for making data collection more robust by adding 'crowd-sourced' checking and updating features via the online labour market website 'www.mechanicalturk.com'. Current work focuses on determining the precise protocol for this verification: how often the data's accuracy should be checked, what should be done if a problem is detected automatically, and how financial constraints on data collection affect the reliability of the verification process. The more we pay for a check, the more reliable its results, so the open question is at what price a single check delivers the most value, as opposed to paying for several smaller checks. A paper detailing this work is being developed by Dr. Lindsay for the 2015 Association for Computing Machinery's annual 'CHI' conference (Seoul, Korea).

2) A data storage and analysis website - this site (www.tellmetheodds.com) is currently in 'Beta' stage, and features a facility for capturing changes in identified electoral gambling markets using data hosted on the project server. The next months of the project will be devoted to developing the public facing aspect of this site in order to roll out the project's impact strategy for the 2015 UK general election (which will take place in May, 2015).

3) A 'big' dataset - The project is well on course with regard to collecting and presenting for public and academic analysis a 'big' dataset of election gambling markets over a huge number of time-points for the national-level elections covered in this project. These datasets are currently hosted on the project server and a select sample can be accessed on the project's beta website. The truly 'big' aspect of the data will develop in the run-up to the UK and Irish general elections, as constituency-level markets become available. Such markets have emerged for the UK elections in recent weeks at the time of writing, and the project is beginning the process of collecting data from them.

4) Political Science Applications: The project has developed analytical techniques allowing us to compare these data to the 'standard' data that is used to evaluate electoral likelihoods and fluctuations during campaigns - namely, polling data. These techniques have been presented via academic papers at two major political science conferences - the 2014 Elections, Public Opinion and Parties conference in Edinburgh, Scotland and the 2014 Political Studies Association of Ireland Conference in Galway, Ireland. The project is also developing techniques for generating forecasts on the basis of constituency level markets, building on previous work by the Principal Investigator in the 2010 elections and the contributions of interested collaborators.

5) Applied analysis - the Scottish independence referendum. A paper explaining our analysis of betting odds movement in the 2014 Scottish Independence Referendum has been submitted to the British Journal of Political Science (BJPS) and is currently under review. Our key finding is that the second leaders' debate between Alex Salmond and Alistair Darling significantly moved the odds towards the 'Yes' side - a movement that was corroborated and amplified by subsequent movement in the polling data.

6) Dr Matthew Wall has written a chapter that will be included in the collection 'Sex, Lies and the Ballot Box 2'. The chapter looks at how the markets were by and large correct in predicting (with 70% certainty) a Conservative victory in the 2015 General Election but, like the polls, were overwhelmingly incorrect in predicting a hung parliament.
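
The data capture described in item 1) above, an average 'price' per market recorded every 15 minutes, can be illustrated with a minimal sketch. It is not the project's Information Gathering Program: it assumes that quotes have already been scraped into (timestamp, site, outcome, decimal odds) tuples, treats the 'price' as the implied probability (the reciprocal of the decimal odds), and keeps only each site's latest quote within an interval so that frequently updating sites are not over-weighted. All names, including the site labels, are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

BUCKET_MINUTES = 15  # capture interval mentioned in the text


def bucket_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 15-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % BUCKET_MINUTES,
                      second=0, microsecond=0)


def average_price(quotes):
    """Average implied probability per (interval, outcome) across sites.

    quotes: iterable of (timestamp, site, outcome, decimal_odds).
    Only the latest quote from each site within an interval is used.
    """
    latest = defaultdict(dict)  # (interval, outcome) -> {site: (timestamp, probability)}
    for ts, site, outcome, decimal_odds in quotes:
        key = (bucket_start(ts), outcome)
        previous = latest[key].get(site)
        if previous is None or ts >= previous[0]:
            latest[key][site] = (ts, 1.0 / decimal_odds)
    return {key: mean(p for _, p in per_site.values())
            for key, per_site in latest.items()}


# Hypothetical quotes from two sites for a single outcome.
quotes = [
    (datetime(2014, 5, 1, 12, 3), "site_a", "UKIP", 2.0),
    (datetime(2014, 5, 1, 12, 10), "site_b", "UKIP", 2.5),
    (datetime(2014, 5, 1, 12, 14), "site_a", "UKIP", 1.6),  # supersedes the 12:03 quote
    (datetime(2014, 5, 1, 12, 20), "site_a", "UKIP", 2.0),  # falls in the next interval
]
series = average_price(quotes)
```

Sorting the keys of `series` by interval gives a time series per outcome that could be set alongside polling data collected over the same period.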
Exploitation Route To date, others have found several major additional applications for our findings.

Firstly, our work and findings are of interest to those in the electoral forecasting community. For example, Dr. Chris Hanretty of the University of East Anglia has written a post on the University of East Anglia blog that looks at the forecasting possibilities of the type of data collected by this project, using previous work by the Principal Investigator (http://www.ueapolitics.org/2014/07/14/general-election-2015-what-do-the-bookmakers-say/).

There is also a considerable non-academic interest in the analysis of data that can provide insight on future electoral likelihoods. For instance, a representative of the company Frontier Economics has contacted the project team and we have shared some of the project data with them with a view to seeking further collaboration for future research and funding bids.

Secondly, our work has piqued the interest of some scholars working in the field of sentiment analysis on social networking sites who are also focused on election campaigns. Following the presentation of a project paper at the Political Studies Association of Ireland conference, the project team was approached by Dr. Jane Suiter of Dublin City University, who conducted a sentiment analysis of Twitter posts around the 'indyref' hashtag throughout the Scottish Independence Referendum campaign. We are currently exploring a potential paper that would integrate our analysis of polling and electoral market data with social media sentiment analysis data to analyse the temporal dynamics of the Scottish Independence Referendum campaign.

The 'What are the Odds?' team have also been invited to present our project at the This&THAT Camp Sussex Humanities Lab in May, 2016. This will be an opportunity to advertise our work to a wider audience of academics and practitioners.

In 2020, the research team submitted an application to the CHERISH-Digital Economy Centre at Swansea University to create a podcast series applying the insights of this research to the 2020 US Presidential Election campaign.
Sectors Digital/Communication/Information Technologies (including Software); Education; Financial Services, and Management Consultancy; Government, Democracy and Justice

URL http://tellmetheodds.com/
Description In terms of impact on the academic and commercial sectors, the project has successfully presented its findings at two major political science conferences: the 2014 Elections, Public Opinion and Parties Annual Conference (Edinburgh, Scotland) and the 2014 Political Studies Association of Ireland Annual Conference (Galway, Ireland). The project was well received at both conferences, making an impact on the academic electoral forecasting community particularly at the Elections, Public Opinion and Parties conference, where a large number of the members of this community were present. The project used this opportunity to present the project's beta website, and several scholars commented that they would seek to use it in their forecasting activities in the run-up to the 2015 election. The project has also proved of interest to the commercial sector: the team has been contacted by the company Frontier Economics, with whom we have shared project data with a view to collaborating on future research and funding bids. The project will present at the This&THAT Camp in Sussex Humanities Lab this May, to an audience of academics and practitioners, seeking to expand impact and build a network of future collaborators. The project has two published academic outputs: the first is a paper on the Scottish independence referendum campaign published in Electoral Studies; the second is a chapter in the popular 'Sex, Lies and the Ballot Box' book series. The book chapter was the subject of an article published on Bloomberg.com at https://www.bloomberg.com/news/articles/2016-10-07/betting-markets-can-t-save-you-from-political-polling-s-problems. Finally, Dr. Matthew Wall conducted a podcast series for the 2020 US Presidential election, using the techniques developed in this project to inform the real-time discussion of a major political campaign.
This led the gambling company Betfair.com to reach out, and to talks about using their data for further analysis and/or coverage of politics. My work led to media appearances on Sky News, BBC Radio Wales, and Times Radio. An article profiling this work in The Conversation, written by me and the podcast co-hosts, attracted over 50,000 views and can be seen at this link: https://theconversation.com/biden-or-trump-betting-markets-are-more-cautious-than-polls-in-predicting-the-2020-us-election-149294
First Year Of Impact 2015
Sector Financial Services, and Management Consultancy; Government, Democracy and Justice
Impact Types Cultural, Economic, Policy & public services

Description Podcast series on 2020 US Presidential Election 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact I co-hosted a podcast series called 'Horse Race Politics: The 2020 US Presidential Election' with two colleagues from our Media Studies department. It was a proof of concept that the type of analysis pioneered in this research project could inform real-time election coverage. According to our metrics, the podcast series had over 1,000 unique listeners, and the series resulted in an article on the website The Conversation that attracted over 40,000 readers.
Year(s) Of Engagement Activity 2020
URL https://podcasts.apple.com/gb/podcast/horse-race-politics/id1532952719#episodeGuid=eef86446-33f6-420...