Making Numbers Meaningful

Lead Research Organisation: University of Birmingham
Department Name: Department of English Literature

Abstract

Much of our everyday activity depends on our ability to comprehend and make decisions based on numerical information. However, many people struggle with innumeracy, the numerical equivalent of illiteracy. A 2014 report for the National Numeracy charity estimated the cost of low numeracy in the UK to be £20 billion per year. The proposed project shifts the burden of innumeracy from the consumer of numerical information to the communicator, recognising the importance of the external presentation of numerical information in language and graphs, and how particular communicative strategies can be unintelligible, biasing, or outright distorting.

The project's starting point is that in order to combat innumeracy, we first need to understand what people actually do with numerical communication on a daily basis. For this project, I will assemble an interdisciplinary team of researchers to conduct the largest investigation so far into numerical communication across multiple communication channels and multiple forms of media. The proposed project innovates by marrying existing research in psychology, education research, and statistics with methods and insights from linguistics and computer science, thus transforming the study of numerical communication into a thoroughly interdisciplinary endeavour.

The project takes the insights generated from the large-scale descriptive analysis of naturally occurring language back into the lab, so that experiments into the most effective and transparent communication strategies can be modelled closely after numerical communication "in the wild", thus bridging laboratory conditions with natural conditions. These experimental investigations into the cognitive effects of particular presentational formats will be used to develop automated ways of classifying the numerical accessibility of documents.

The research will make significant academic impact by showing how numerical communication depends on multiple communication systems that interact with each other and therefore need to be studied in an integrated fashion. The project will also impact several different academic fields by innovating methodologies via the use of machine learning tools to automatically extract numerical information from language and graphs.

The proposed research has wide-ranging benefits to non-academic stakeholders. This impact will be realised by working closely together with data analysts from commercial businesses on innovative ways of making pressing societal issues (such as climate risks or health risks) more accessible to people with low numeracy. In addition, the project aims to improve the training of data analysts so that more attention is paid to matters of transparent communication. The long-term goal is to establish a lasting legacy of data analytics training that leads to improvements in how numerical information is communicated across a large range of data-dependent business sectors.

Planned Impact

The non-academic beneficiaries of this research are:

1) Commercial private sector beneficiaries, specifically data analysts working in the tech industry and data-driven sectors (data science consultancy, finance, automotive, advertising, medical)

2) Journalists, especially those who rely on communicating statistics to the wider public (such as science writers or journalists reporting on poll results)

3) The wider public, especially people with low numeracy and/or high math anxiety who struggle with making informed decisions on numeracy-dependent issues

The primary benefits are:

1) Filling a gap in the existing training of data analysts, who are not extensively trained in matters of communication even though they rely on being able to communicate statistical results efficiently and clearly with various stakeholders

2) Increasing the accessibility of numerical information that is produced by data analysts so that the information can be more easily understood by private sector decision makers who rely on accurate and transparent reporting

3) Making pressing societal issues (climate change, health risks etc.) easier to understand for people with low numeracy and/or high math anxiety by working together with data analysts to innovate accessible means of communication

Planned impact activities will be co-organised with industry partners and are scheduled for all four years of the scheme. This includes impact activities that are of immediate benefit to participants of local workshops and "hackathons" in Birmingham (years 1-4). The outputs of the hackathons are also of immediate benefit to the general public (being able to understand complex numeracy-dependent issues via accessible information displays). The insights generated by these local activities will be used to inform impact activities that are aimed at changing the communicative practice of data analysts across the UK and abroad on a more long-term basis (year 4: MOOC, evidence-based guide, +3 period: popular science book). Additional activities in the +3 period (creating a data communication consultancy firm together with Innovation Birmingham) will transform the activities of the first four years into a lasting legacy of data communication training.

Publications

Cwiek A (2021) Novel vocalizations are understood across cultures. in Scientific reports

Cwiek A (2022) The bouba/kiki effect is robust across cultures and writing systems. in Philosophical transactions of the Royal Society of London. Series B, Biological sciences

Fischer M (2021) More Instructions Make Fewer Subtractions in Frontiers in Psychology

Winter B (2021) Size sound symbolism in the English lexicon in Glossa: a journal of general linguistics

Winter B (2021) Rethinking the frequency code: a meta-analytic review of the role of acoustic body size in communicative phenomena in Philosophical Transactions of the Royal Society B: Biological Sciences

 
Description The Making Numbers Meaningful project has produced a number of important findings about the nature of numerical communication. In line with the project's goal to study numerical communication 'multimodally' (focusing on different modalities of communication and their interaction), we have three new experiments which demonstrate that co-speech gestures produced during numerical expressions can change what number an audience has in mind. For example, moving the hands outwards, away from the torso, while saying "400 people were at the protest; several of them got arrested" leads people to estimate a higher number of people who got arrested. In an upcoming review paper, "Multimodality matters in numerical communication", we also argue that such gestures offer more plausible deniability than speech, and that their influence on numerical judgments can be quite treacherous because it often goes unnoticed.

In a separate strand of the project, we have investigated how people strategically manipulate numerical language to make statistics appear as good or bad as possible. We performed two experiments in which participants viewed the results of school exams, with varying numbers of students answering more or fewer questions correctly. We then asked our participants to frame the exam results as positively or as negatively as possible. Results show that people's language changes considerably, and we especially noted a higher proportion of "informationally weaker" quantifiers, that is, expressions that are relatively more 'vague', such as "some" or "most", as opposed to "all" or "none". These results show that vagueness is strategically exploited by English speakers when communicating numerical information.

In a final strand of the project, we look at a phenomenon known as "addition bias", or alternatively, "subtraction neglect". It has previously been found in behavioral research that English speakers systematically overlook opportunities to take things away, even if this may be the preferred option. For example, when asked to review a manuscript, reviewers will almost always suggest adding new sections of text rather than reducing them. In our upcoming paper (accepted in Cognitive Science), we show that this bias is reflected deeply in the English language. Addition-related words such as "add", "more", and "plus" are much more frequent than subtraction-related words such as "subtract", "less", and "minus", and they occur in more positive contexts. We also use GPT-3 to show that large language models, when given prompts to "improve" something, will tend to suggest additions rather than subtractions. Moreover, we show that the meaning of seemingly neutral verbs like "to improve" or "to change" already implicitly suggests addition. This has serious consequences for our everyday life, as overlooking opportunities to subtract can be detrimental, such as when bureaucracies proliferate because people only consider adding new forms or processes. We show that these biases are embedded in the English language, and that we therefore need to be extra careful not to fall prey to subtraction neglect.
Exploitation Route Our gesture results are the first to demonstrate empirically that gestures matter in numerical communication between adults. This opens up a whole new world of research, as there are many different types of gestures and many different types of communicative contexts in which gestures matter. The study of co-speech numerical gestures is ripe for follow-up studies, including studies that investigate how and whether the effect of gesture on numerical communication and data-driven decisions can go unnoticed.

Our student exam result experiments show that vagueness is implicated in communicating numerical information strategically. This makes clear predictions for future research: whenever there is a choice between a precise and a vague or approximate expression, a speaker should be more likely to choose the vague/approximate one in a context that has a higher need to 'fudge' the numbers. Future work can also look at existing texts to see whether the words used are more or less vague to reverse engineer incentives, e.g., whether certain newspapers or bloggers or politicians on Twitter are more or less likely to represent numerical facts in a slanted way.

Our subtraction neglect results have immediate implications for practice. We have shown that even just saying something like "How can we improve this?" at a meeting will tend to elicit addition-related results because the word "improve" already implicitly suggests 'improving by adding something'. We therefore need to be aware of the subtle biasing effect of language. This could lead to the development of new judgment and decision making studies that examine how different types of language attenuate or strengthen the addition bias/subtraction neglect; it could also lead to the development of materials to improve decision-making in organizational contexts.
Sectors Creative Economy,Digital/Communication/Information Technologies (including Software),Education,Environment,Financial Services, and Management Consultancy,Other

 
Description In April 2022, we hosted a successful Birmingham Artificial Intelligence Network "brum.ai" virtual event called "Communication in Data Science", which brought together a panel of industry experts (Jelena Ilic, BT; Patrice Hazam, Facebook/Meta; Gabriel Griffin-Booth, UK National Grid) to discuss issues they face with communication in data analytics, machine learning, and their respective sectors more widely. We also allowed the audience - a total of 36 data analysts from various sectors across the UK, including medical and finance - to exchange their experiences about issues they faced in communicating data to various stakeholders. Post-event feedback showed that the event was helpful in raising participants' awareness of the importance of communication, and several respondents indicated that they were now willing to invest more into communication and to read up on data visualization and related topics.

In 2021 and 2022, Dr. Bodo Winter has also been part of the University of Birmingham-led "Data Skills for Creative Industries", a data analytics bootcamp aimed at retraining people outside of the data or creative industries sectors. The Future Leader Fellowship project's insights were used to give a workshop on communicating data. Several of the students of the first cohort (second cohort currently ongoing) have successfully gotten jobs in the industry.

Some of the very early findings of the review/theory part of the first few months (prior to data collection) were incorporated into a workshop presentation at our industry partner Digital Natives (May 7, 2021). The presentation, called "What cognitive science can teach us about communicating data", reviewed evidence from cognitive science - including emerging evidence from early findings - that helps data analysts improve how they communicate with their stakeholders. The session was attended by about 30 data analysts who were either data science apprentices (about to complete their apprenticeship with Digital Natives and work in the industry) or alumni who are already data analysts working in various sectors. The ensuing discussion during the Q&A section and feedback by advisory board member Tony Moran have been used to steer the research in the direction that generates the most impact with practicing data analysts.

In addition, several papers published during this time that are directly related to the theme of multimodal communication running through the project have been featured in various news outlets following three successful press releases. Our marketing office has kept track of where our work has been reported, and the list includes more than 50 newspaper articles in more than 10 countries.
Sector Digital/Communication/Information Technologies (including Software),Other
Impact Types Cultural,Societal

 
Description Birmingham Artificial Intelligence "brum.ai" Network Event "Communication in Data Science" 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact The "Communication in Data Science" event was hosted as part of the UKRI FLF "Making numbers meaningful" project and attended by 36 industry experts from various sectors, all of whom are either doing or supervising data analysis. After a talk by Dr. Bodo Winter on the findings and vision of the project, there was an expert panel with three panelists from British Telecommunications (Jelena Ilic), Facebook/Meta (Patrice Hazam) and National Grid (Gabriel Griffin-Booth). After hearing their opinions about where in the industry communication between data scientists and decision makers goes wrong, the entire group went into break-out rooms to exchange their experiences and continue the discussion. A post-event survey indicated that attendees became much more aware of just how important communication is in data analysis, and they indicated a clear interest in learning more about effective communication and in talking to people in their respective institutions/companies about this.
Year(s) Of Engagement Activity 2022
 
Description Data Skills for Creative Industries Session on "Communicating Data" at University of Birmingham 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other audiences
Results and Impact Our school (Dept. of English Language & Linguistics and Dept. of Modern Languages) at the University of Birmingham won a West Midlands bid to offer a "Data Skills for Creative Industries" certificate with re-employment opportunities to grow data skills in the West Midlands. We saw an opportunity to merge the impact activities of the UKRI FLF "Making numbers meaningful" project with this certificate, and made "communication in data science" part of the five-day 'bootcamp' that started the course. The attendees of this event are either currently unemployed postgraduate students from various universities and from disciplines outside of data, or people who are currently in a professional job but want to either improve their data skills or change careers. Success is indicated by the fact that the first group reported high satisfaction, several attendees were offered new jobs, and we were allowed to run a second year, where I will also offer a session informed by the research conducted as part of the UKRI FLF project.
Year(s) Of Engagement Activity 2021,2022
 
Description Digital Natives workshop "What cognitive science can teach us about communicating data" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Industry/Business
Results and Impact About 30 practicing data analysts attended a workshop hosted by Digital Natives where I presented on "What cognitive science can teach us about communicating data".
Year(s) Of Engagement Activity 2021
 
Description Frontiers in Science (University of Northern Colorado) talk to high schoolers about interdisciplinary research 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact Talk to young high schoolers from underrepresented backgrounds on scholarships about interdisciplinary research and how linguistics and STEM intersect (July 6, 2021) at the Frontiers of Science Institute at the University of Northern Colorado. These students are STEM-focused, and the goal is to get them to think more broadly and to appreciate the role of the humanities in science, as well as how STEM fields actually intersect with the humanities.
Year(s) Of Engagement Activity 2021