Charting lexical development through dense coding and analysis of word senses

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Philosophy Psychology & Language

Abstract

Children learn words at a remarkable rate, which is all the more striking because most words are highly ambiguous. Specifically, most words are "polysemous": they can be used in a range of distinct but related senses. For instance, think of all the different ways that the word "run" can be used: you can run a race, run a car, run a company, run some tests, or even run a bath. These flexible uses give us vivid expressive power -- we can use a word with one meaning while alluding to all the rest -- but most researchers have tended to assume that polysemy also makes words hard to learn. In particular, if a word's meaning changes across uses, how are children ever supposed to acquire it?

We have argued that this assumption is backwards, and that flexible uses of words actually help children to acquire a vocabulary. That is because the senses of a word are related to one another (e.g., there is an obvious link between the senses of run in "run a car" versus "run a company"), and so once a child has learned one sense of a word, they can use that knowledge as a platform from which to infer new senses more easily. By contrast, if all words were unambiguous, or if all ambiguous words had unrelated meanings, then each new meaning would have to be learned without any support. Thus, while polysemy may seem confusing from the outside, it is actually an opportunity for learning when seen from the perspective of a child.

This project tests three implications of this idea. First, for polysemy to help learning, it should be plentiful in the input that children receive. However, we do not know how much polysemy children hear. Indeed, it is possible that caregivers implicitly avoid polysemy, particularly with younger children, if they tend to use simple and unambiguous words.

Second, if polysemy indeed helps learning, then we should expect children to use polysemy themselves, even in the first words that they say. Third, if polysemy helps learning, then it should be easier for children to learn new senses for words they already know than to learn entirely new words.

To test these implications, we need a database of how caregivers and children use polysemous words, one that specifies which senses are used when. However, while there are existing databases of how caregivers and children interact, they never specify the different senses in which words are used. For instance, when "run" is used, the corpora never specify which meaning is intended. Hence, quantitative analyses of polysemy are currently impossible.

Thus, the first aim of this research is to recode these databases of child language in terms of word senses. To do this, we will use a new toolkit that our group has developed, and we will apply it to both English and French data. The output will be (to our knowledge) the largest sense-annotated database yet constructed, not only of child language, but of human conversation more broadly.
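
To make the idea of a sense-annotated record concrete, here is a minimal sketch in Python of what one coded word token might look like and how sense counts could be tallied. It is an illustration only, not the project's toolkit or database format: the class name, the field names, the sense labels (such as "run.fill_bath") and the example utterances are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SenseToken:
    """One word token with a sense label (all field names are hypothetical)."""
    speaker: str           # e.g. "caregiver" or "child"
    child_age_months: int  # age of the child when the utterance was recorded
    lemma: str             # dictionary form of the word, e.g. "run"
    sense: str             # illustrative sense label, e.g. "run.move_fast"
    utterance: str         # the utterance in which the token occurred


def senses_per_lemma(tokens):
    """Map each lemma to a Counter of the senses in which it was used."""
    counts = {}
    for tok in tokens:
        counts.setdefault(tok.lemma, Counter())[tok.sense] += 1
    return counts


# Invented example tokens, not real corpus data.
tokens = [
    SenseToken("caregiver", 24, "run", "run.move_fast", "look at the dog run"),
    SenseToken("caregiver", 24, "run", "run.fill_bath", "let's run your bath"),
    SenseToken("child", 24, "run", "run.move_fast", "doggy run"),
]
summary = senses_per_lemma(tokens)
```

With records of this kind, a question such as "which senses of run does a child hear, and how often?" becomes a simple lookup in summary.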

Using this database, we will then examine how caregivers use polysemy when talking to their children, and how children use polysemy in their earliest language use. We will build novel statistical models that address questions such as whether parents avoid polysemy when talking to their youngest children, and whether children find it easier to use new senses for old words or to learn entirely new words.
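
As a rough illustration of the first of these questions, the sketch below computes, for each child age, the fraction of caregiver words that were heard in more than one sense. This is a simple descriptive count under assumed inputs, not the statistical models the project proposes; the function name, the (age, lemma, sense) record layout, the sense labels and the data are all invented for the example.

```python
from collections import defaultdict


def polysemy_rate_by_age(records):
    """Fraction of lemmas used in more than one sense, per child age.

    records: iterable of (child_age_months, lemma, sense) tuples for
    caregiver tokens. This layout is a hypothetical simplification.
    """
    senses = defaultdict(lambda: defaultdict(set))
    for age, lemma, sense in records:
        senses[age][lemma].add(sense)
    return {
        age: sum(len(s) > 1 for s in by_lemma.values()) / len(by_lemma)
        for age, by_lemma in senses.items()
    }


# Invented caregiver tokens, not real corpus data.
records = [
    (18, "run", "run.move_fast"),
    (18, "ball", "ball.toy"),
    (30, "run", "run.move_fast"),
    (30, "run", "run.fill_bath"),
    (30, "ball", "ball.toy"),
]
rates = polysemy_rate_by_age(records)
```

In this toy data, no word is used in more than one sense at 18 months, while one of the two words is at 30 months. A real analysis would also need to control for how much speech was sampled at each age, since larger samples reveal more senses.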

These data and analyses should provide the clearest picture to date of how children acquire polysemy, with implications for theories of language development, for linguistic theory, and for our understanding of how caregivers interact with their children. We will make our tools and databases publicly available, and report our results in scholarly journals.
