A Unified Model of Compositional and Distributional Semantics: Theory and Applications

Lead Research Organisation: University of Cambridge
Department Name: Computer Science and Technology

Abstract

The notion of meaning is central to many areas of Computer Science, Artificial Intelligence (AI), Linguistics, Philosophy, and Cognitive Science. A formal, mathematical account of the meaning of natural language utterances is crucial to AI, since an understanding of natural language (i.e. languages such as English, German, Chinese, etc.) is at the heart of much intelligent behaviour. More specifically, Natural Language Processing (NLP) --- the branch of AI concerned with the computer processing, analysis and generation of text --- requires a model of meaning for many of its tasks and applications.

There have been two main approaches to modelling the meaning of language in NLP, in order that a computer can gain some "understanding" of the text. The first, the so-called compositional approach, is based on classical ideas from Philosophy and Mathematical Logic. Using a well-known principle from the 19th-century logician Frege --- that the meaning of a phrase can be determined from the meanings of its parts and how those parts are combined --- logicians have developed formal accounts of how the meaning of a sentence can be determined from the relations among the words in the sentence. This idea famously culminated in Linguistics in the work of Richard Montague in the 1970s. The compositional approach addresses a fundamental problem in Linguistics --- how it is that humans are able to generate an unlimited number of sentences using a limited vocabulary --- and we would like computers to have a similar capacity.

The second, more recent, approach to modelling meaning in NLP focuses on the meanings of the words themselves. This is the so-called distributional approach to modelling word meanings, and is based on the ideas of the "structural" linguists such as Firth from the 1950s. The idea is also sometimes related to Wittgenstein's philosophy of "meaning as use". The idea is that the meanings of words can be determined by considering the contexts in which words appear in text. For example, if we take a large amount of text and see which words appear close to the word "dog", and do the same for the word "cat", we will find that the contexts of "dog" and "cat" tend to share many words (such as walk, run, furry, pet, and so on). If, on the other hand, we see which words appear in the context of the word "television", we will find much less overlap with the contexts for "dog". Mathematically, we represent the contexts in a vector space, so that word meanings occupy positions in a geometrical space. We would expect to find that "dog" and "cat" are much closer in the space than "dog" and "television", indicating that "dog" and "cat" are closer in meaning than "dog" and "television".
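The distributional idea above can be sketched in a few lines of code. The context counts below are made up for illustration (a real system would extract them from a large corpus with a sliding window), and the cosine measure is the standard way of comparing such vectors; this is a toy sketch, not the project's actual model.

```python
# Toy sketch of the distributional approach: compare words by the
# cosine similarity of their context-count vectors.
from collections import Counter
import math

# Hypothetical co-occurrence counts (illustrative only).
contexts = {
    "dog":        Counter({"walk": 4, "run": 3, "furry": 2, "pet": 5}),
    "cat":        Counter({"run": 2, "furry": 4, "pet": 5, "purr": 3}),
    "television": Counter({"watch": 6, "screen": 4, "channel": 3, "run": 1}),
}

def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm = lambda x: math.sqrt(sum(c * c for c in x.values()))
    return dot / (norm(u) * norm(v))

print(cosine(contexts["dog"], contexts["cat"]))         # relatively high
print(cosine(contexts["dog"], contexts["television"]))  # relatively low
```

On these toy counts, "dog" is far closer to "cat" than to "television", exactly the geometric picture described above.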

The two approaches to meaning can be roughly characterized as follows: the compositional approach is concerned with how meanings combine, but has little to say about the individual meanings of words; the distributional approach is concerned with word meanings, but has little to say about how those meanings combine. Our ambitious proposal is to exploit the strengths of the two approaches, by developing a unified model of distributional and compositional semantics. Our proposal has a central theoretical component, drawing on models of semantics from Theoretical Computer Science and Mathematical Logic. This central component will inform, be driven by, and be evaluated on tasks and applications in NLP and Information Retrieval, as well as data drawn from empirical studies in Cognitive Science (the computational study of the mind). Hence we aim to make the following fundamental contributions:

1. advance the theoretical study of meaning in Linguistics, Computer Science and Artificial Intelligence;

2. develop new meaning-sensitive approaches to NLP applications which can be robustly applied to naturally occurring text.

Planned Impact

The proposal is an ambitious scientific endeavour aiming to solve a fundamental problem which cuts across Computer Science, Linguistics, Philosophy, Artificial Intelligence and Cognitive Science. Hence providing a solution to the problem will make a substantial contribution to the academic knowledge base in the UK and beyond. Beyond academia, the techniques that we aim to provide for representing meaning in a computer will benefit companies operating in the area of Semantic Computing and Language Technology, e.g. in the area of web search.

Language technology will become increasingly important in the 21st century, allowing people to communicate naturally with computers through text, and increasingly speech, interfaces. In order for this natural communication to take place, natural language processing needs to provide sophisticated meaning representations which allow the computer to "understand" natural language, which is the focus of the proposal. Hence the proposed research has the potential to impact positively on a massive proportion of the population, in the short-to-medium term through better search engines, and in the longer term through more natural language interfaces to computers. "Computers" here should be interpreted in a wider sense than desktop computers, to include the sorts of pervasive computing devices that will soon be commonplace, such as speech-enabled devices controlling various aspects of the home and car, for example.

The pathway to impact for the commercial collaborator, Metrica, is clear, in that the proposed research can impact directly on their sentiment analysis technology. Sentiment analysis is a rapidly growing area of language technology which aims to automatically determine the sentiment being expressed by consumers or the public about a particular product, or political party, for example. Given the massive increase in online content, and the increasing desire for people to make their views known via social networking sites, this technology is only going to become more important.

The main way in which the skills pipeline will be influenced by the research is through the training of the researchers associated with the project. The training will involve working with five of the leading Computer Science and Informatics departments in the UK, and some internationally renowned researchers. The skills gained will be a broad set of research skills cutting across Computer Science, Linguistics, Philosophy, Artificial Intelligence and Cognitive Science. Such skilled researchers will be in increasing demand by academia and industry as the importance of language technology continues to grow.

Publications

Kiela, D. (2014) Improving Multi-Modal Representations using Image Dispersion: why less is sometimes more. In Proceedings of the Short Papers of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014).

Kiela, D. (2014) A Systematic Study of Semantic Vector Space Model Parameters. In Proceedings of the 2nd Workshop on Continuous Vector Space Models and their Compositionality (CVSC).

Grefenstette, E. (2014) Computing Meaning - Volume 4.

Maillard, J. (2014) A Type-Driven Tensor-Based Semantics for CCG. In Proceedings of the EACL 2014 Type Theory and Natural Language Semantics Workshop (TTNLS).

Kiela, D. (2017) Learning Neural Audio Embeddings for Grounding Semantics in Auditory Perception. In Journal of Artificial Intelligence Research.

Kiela, D. (2013) Detecting Compositionality of Multi-Word Expressions using Nearest Neighbours in Vector Space Models. In Proceedings of the Short Papers of the Conference on Empirical Methods in Natural Language Processing (EMNLP-13).

 
Description Natural Language Processing applications, for example machine translation and semantic search, require models of what words and sentences mean. Explicitly coding such models by hand has proven very difficult. A more promising approach is for a computer to learn about language by analyzing large bodies of text automatically, and also analyzing other data sources such as images. However, language has the property that words and phrases can be combined to create new meanings (of phrases and sentences). In this project we have developed techniques for a computer to a) learn the meanings of words and phrases; and b) learn how to put them together.
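The composition step in b) can be illustrated with a deliberately simple additive baseline, a common starting point in compositional distributional work (this is an illustrative stand-in, not the project's actual model, and the embedding values below are made up):

```python
# Sketch of composing word vectors into a phrase vector, so the phrase
# can be compared with single words in the same semantic space.
import math

# Hypothetical 4-dimensional word embeddings (illustrative only).
vec = {
    "furry": [0.9, 0.1, 0.3, 0.0],
    "dog":   [0.8, 0.2, 0.1, 0.4],
    "cat":   [0.7, 0.3, 0.2, 0.3],
}

def compose_add(u, v):
    """Additive composition: phrase vector = element-wise sum."""
    return [a + b for a, b in zip(u, v)]

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda x: math.sqrt(sum(a * a for a in x))
    return dot / (norm(u) * norm(v))

# The composed phrase "furry dog" lives in the same space as the words,
# so it can be compared with, say, "cat".
furry_dog = compose_add(vec["furry"], vec["dog"])
print(cosine(furry_dog, vec["cat"]))
```

Richer compositional models replace the element-wise sum with operators (e.g. tensors) that are sensitive to syntactic structure, which is the direction the project pursued.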
Exploitation Route Our findings can be applied to current machine learning approaches to natural language processing, especially "deep learning" approaches, which are currently enjoying many successes but are still linguistically ill-informed.
Sectors Creative Economy, Digital/Communication/Information Technologies (including Software)

URL https://sites.google.com/site/discoprojectuk/about
 
Description A Mathematical Framework for a Compositional Distributional Model of Meaning 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Invited talk at Stanford University as part of a symposium on vector space semantics.

http://sesquipedalian.stanford.edu/?p=13891

Year(s) Of Engagement Activity 2013
 
Description A Mathematical Framework for a Distributional Compositional Model of Meaning 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Invited talk at the University of Groningen, the Netherlands.

Year(s) Of Engagement Activity 2013
 
Description A Mathematical Framework for a Distributional Compositional Model of Meaning - II 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Invited seminar talk at King's College London.

Year(s) Of Engagement Activity 2012
 
Description Compositional and Distributional Models of Meaning for Natural Language 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Invited departmental seminar at the University of Sheffield Department of Computer Science.

Year(s) Of Engagement Activity 2012
 
Description Compositional and Distributional Models of Meaning for Natural Language - II 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Invited seminar at the University of Ulster.

Year(s) Of Engagement Activity 2013
 
Description Helping Computers Overcome Natural Language Ambiguity 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact An article for the University of Cambridge advertising its university-wide Language Sciences Initiative.

Year(s) Of Engagement Activity 2014
 
Description Our ambiguous world of words 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Article for the Research web pages of the University of Cambridge describing the project.

http://www.cam.ac.uk/research/features/our-ambiguous-world-of-words

Year(s) Of Engagement Activity 2013