Charting lexical development through dense coding and analysis of word senses

Lead Research Organisation: University of Edinburgh

Department Name: School of Philosophy, Psychology & Language

Abstract

Children learn words at a remarkable rate, which is all the more striking because most words are highly ambiguous. Specifically, most words are "polysemous", which means that they can be used in a range of distinct but related senses. For instance, think of all the different ways that the word "run" can be used: you can run a race, run a car, run a company, run some tests, or even run a bath. These flexible uses of words provide us with vivid expressive power -- we can use a word with one meaning while alluding to all the rest -- but most researchers have assumed that polysemy also makes words hard to learn. In particular, if a word's meaning changes across uses, how are children ever supposed to acquire it?

We have argued that this assumption is backwards, and that flexible uses of words actually help children to acquire a vocabulary. That is because the senses of words are related to one another (e.g., there is an obvious link between the senses of run in "run a car" versus "run a company"), and so once a child has learned one sense of a word, then they can use that knowledge as a supportive platform to more easily infer new senses. By contrast, if all words were unambiguous, or all words had unrelated ambiguous meanings, then each new meaning would have to be learned without any support. Thus, while polysemy may seem confusing from the outside, it is actually an opportunity for learning, when seen from the perspective of a child.

This project tests three implications of this idea. First, if polysemy is to help learning, it should be plentiful in the input that children receive. However, we do not know how much polysemy children hear. Indeed, it is possible that caregivers implicitly avoid polysemy, particularly with younger children, if they tend to say simple and unambiguous words.

Second, if polysemy indeed helps learning, then we should expect that children also use polysemy, even in the first words that they say. Finally, if polysemy helps learning, then we should expect that it will be easier for children to learn new senses for words than to learn entirely new words.

To test these implications, we need a database of how caregivers and children use polysemous words that specifies which senses are used when. However, while there are existing databases of how caregivers and children interact, they never specify the different senses in which words are used. For instance, when "run" is used, the corpora never specify which meaning is intended. Hence, quantitative analyses of polysemy are currently impossible.

Thus, the first aim of this research is to recode these databases of child language in terms of word senses. To do this, we will use a new toolkit that our group has developed, and we will apply it to both English and French data. The output will be (to our knowledge) the largest sense-annotated database yet constructed, not only in terms of child language, but in terms of human conversations more broadly.

Using this database, we will then examine how caregivers use polysemy when talking to their children, and how children use polysemy in their earliest language use. We will build novel statistical models that assess questions such as whether parents avoid using polysemy when talking to their youngest children, and whether children find it easier to use new senses for old words or to learn entirely new words.

These data and analyses should provide the clearest picture to date of how children acquire polysemy, with implications for theories of language development, theories from linguistics, and for our understanding of how caregivers interact with their children. We will make our tools and databases publicly available, and describe our results in scholarly journals.

Funded Value:

£241,194

Funded Period:

Aug 2021 - Feb 2024

Funder:

ESRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

ES/V012878/1

Principal Investigator:

Hugh Rabagliati

Research Subject:

Linguistics (27%)

Psychology (72%)

Research Topic:

Cognitive Psychology (36%)

Developmental Psychology (18%)

Language Acquisition (18%)

Lexicon (9%)

Psychology (18%)

Organisations

University of Edinburgh (Lead Research Organisation)

People	ORCID iD
Hugh Rabagliati (Principal Investigator)
Barbora Skarabela (Researcher)

Publications

Brough J (2024) Cognitive causes of 'like me' race and gender biases in human language production. in Nature human behaviour

Skarabela B (2023) Learning Dimensions of Meaning: Children's Acquisition of But

Skarabela B (2023) Learning dimensions of meaning: Children's acquisition of but. in Cognitive psychology

Key Findings

Description	In this project, we annotated the meanings of words that children hear and use. Prior investigations of how children learn words have focused on the forms (i.e., sounds) that they use, rather than on the meanings of these words. Our achievements and discoveries through this project include:
1. We created by far the largest human-checked, meaning-annotated corpus of conversations, particularly for conversations involving children. In total, we annotated the meanings of more than one million words. This served as a base for our investigations of how children learn and use meanings, and will be made available soon to other researchers in cognitive and computational language sciences.
2. We made a number of discoveries about how children learn word senses. For instance, we found that children use a broad array of senses from early in life, but still use words in less ambiguous ways than adults.
3. We found that parents do not simplify the meanings of the words that they use with children.
4. Contrary to prior theorising, we found that children rarely over-extend the meanings of words (i.e., use words with unusual meanings), and that they do so at about the same rate as adults.
5. We annotated how certain abstract words (connectives like "but") are used in both speech to children and literature for children, finding that their meanings are very distinct in these two contexts. This may explain why children often struggle to master connectives in school.
Exploitation Route	Our database of annotated meanings will be useful both for other researchers and for those in industry who are building linguistic tools for use by children. Our comparison of meanings in speech vs literature will be useful for researchers, but also for educators who are trying to understand why their young students may struggle to master meanings that they take for granted.
Sectors	Digital/Communication/Information Technologies (including Software), Education, Other

Impact Summary

Description	The non-academic impacts of this award have been focused on early years work. To boost the impact of this award, the PDRA secured an ESRC Impact Accelerator grant, which was used to host events for practitioners, children and families designed to highlight early years language, and to develop a new app designed to highlight how parents can facilitate early language learning. The practitioner events (e.g., presentations to the Scottish Book Trust) focused on how talking to children is known to enhance their language development. The app is currently in alpha testing and thus not yet contributing societal impact, but we hope it will be released soon.
First Year Of Impact	2023
Impact Types	Cultural

Further Funding

Description	The Power of Words: The importance of boosting children's vocabulary size in the preschool years
Amount	£35,000 (GBP)
Organisation	Economic and Social Research Council
Sector	Public
Country	United Kingdom
Start	06/2022
End	03/2023

Engagement Activities

Description	The Power of Words: Words of Music.
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	Regional
Primary Audience	Other audiences
Results and Impact	Day-long workshops for parents and caregivers at St Cecilia's music hall, highlighting how words and metaphors impact our understanding of music, and running collaborative early years Bookbug sessions on the importance of language and early reading.
Year(s) Of Engagement Activity	2022
