A Unified Model of Compositional and Distributional Semantics: Theory and Applications

Lead Research Organisation: University of York
Department Name: Computer Science

Abstract

The notion of meaning is central to many areas of Computer Science, Artificial Intelligence (AI), Linguistics, Philosophy, and Cognitive Science. A formal, mathematical account of the meaning of natural language utterances is crucial to AI, since an understanding of natural language (i.e. languages such as English, German, Chinese, etc.)
is at the heart of much intelligent behaviour. More specifically, Natural Language Processing (NLP) --- the branch of AI concerned with the computer processing, analysis and generation of text --- requires a model of meaning for many of its tasks and applications.

There have been two main approaches to modelling the meaning of language in NLP, in order that a computer can gain some "understanding" of the text. The first, the so-called compositional approach, is based on classical ideas from Philosophy and Mathematical Logic. Using a well-known principle from the 19th-century logician
Frege --- that the meaning of a phrase can be determined from the meanings of its parts and how those parts are combined --- logicians have developed formal accounts of how the meaning of a sentence can be determined from the relations between the words in the sentence. This idea famously culminated in Linguistics in the work of Richard Montague in the 1970s. The compositional approach addresses a fundamental problem in Linguistics: how it is that humans are able to generate an unlimited number of sentences using a limited vocabulary. We would like computers to have a similar capacity.
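To make Frege's principle concrete, here is a minimal sketch in the Montague style, where word meanings are functions and sentence meanings arise by applying them to one another; the tiny model, its entities and its predicate are invented purely for illustration:

    # Minimal sketch of Frege's principle in the Montague style: word
    # meanings are functions, and sentence meanings arise by function
    # application. The entities and predicate below are invented.
    john, mary = "john", "mary"
    walkers = {john}  # the set of entities that walk

    def walks(entity):
        # An intransitive verb denotes a function from entities to truth values.
        return entity in walkers

    # The meaning of "John walks" is the verb meaning applied to the subject.
    print(walks(john))  # True
    print(walks(mary))  # False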

The second, more recent, approach to modelling meaning in NLP focuses on the meanings of the words themselves. This is the so-called distributional approach to modelling word meanings and is based on the ideas of the "structural" linguists such as Firth from the 1950s. This idea is also sometimes related to Wittenstein's philosophy of "meaning as use". The idea is that the meanings of words can be determined by considering the contexts in which words appear in text. For example,
if we take a large amount of text and see which words appear close to the word "dog", and do a similar thing for the word "cat", we will see that the contexts of dog and cat tend to share many words in common (such as walk, run, furry, pet, and so on). Whereas if we see which words appear in the context of the word "television", for example, we will find less overlap with the contexts for "dog". Mathematically we represent the contexts in a vector space, so that word meanings occupy positions in a geometrical space. We would expect to find that "dog" and "cat" are much closer in the space than "dog" and "television", indicating that "dog" and "cat" are closer in meaning than "dog" and "television".
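As an illustration of the distributional idea, the following sketch builds context vectors by counting co-occurrences within a small window and compares them by cosine similarity; the toy corpus and stop-word list are invented for illustration:

    # Toy distributional model: count which words co-occur with each target
    # word inside a small window, then compare targets by cosine similarity.
    # The corpus and stop-word list are invented for illustration.
    import math
    from collections import Counter, defaultdict

    corpus = ("we walk the furry dog in the park . the dog can run . "
              "a pet dog will run and play . we walk the furry cat . "
              "the cat can run and play . my pet cat sleeps . "
              "we watch television at night . the television show was long . "
              "a new television arrived .").split()
    stop = {"the", "a", "we", "my", "in", "at", "and", "will", "can", "was", "."}

    window = 2
    contexts = defaultdict(Counter)
    for i, word in enumerate(corpus):
        for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
            if j != i and corpus[j] not in stop:
                contexts[word][corpus[j]] += 1

    def cosine(u, v):
        dot = sum(u[w] * v[w] for w in set(u) & set(v))
        norm = lambda c: math.sqrt(sum(n * n for n in c.values()))
        return dot / (norm(u) * norm(v))

    print(cosine(contexts["dog"], contexts["cat"]))         # high (about 0.8)
    print(cosine(contexts["dog"], contexts["television"]))  # 0.0: no shared contexts here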

The two approaches to meaning can be roughly characterized as follows: the compositional approach is concerned with how meanings combine, but has little to say about the individual meanings of words; the distributional approach is concerned with word meanings, but has little to say about how those meanings combine. Our ambitious proposal is to exploit the strengths of both approaches by developing a unified model of distributional and compositional semantics. Our proposal has a central theoretical component, drawing on models of semantics from Theoretical Computer Science and Mathematical Logic. This central component will inform, be driven by, and be evaluated on tasks and applications in NLP and Information Retrieval, as well as on data drawn from empirical studies in Cognitive Science (the
computational study of the mind). Hence we aim to make the following fundamental contributions:

1. advance the theoretical study of meaning in Linguistics, Computer Science and Artificial Intelligence;

2. develop new meaning-sensitive approaches to NLP applications which can be robustly applied to naturally occurring text.
 
Description The mathematics behind models of language, logic, quantum physics, and computation has a common core. Transferring tools from one field to another allows us to extend models of meaning beyond simple noun phrases, to include words with logical or structural meanings. The application of tools from theoretical and quantum computing to models of language, developed within the project, has drastically reduced the complexity of using these models. These simplifications also have useful applications in other fields of computer science and mathematics.
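As an illustration of the kind of model involved, the categorical approach composes word vectors by tensor contraction: nouns are vectors, and a transitive verb is a higher-order tensor that maps its subject and object to a sentence meaning. A toy sketch, taking the sentence space to be one-dimensional and using invented numbers, might look as follows:

    # Illustrative sketch of tensor-based composition in the categorical
    # (compositional distributional) style: nouns are vectors, a transitive
    # verb is a tensor, and "subject verb object" is computed by contracting
    # the verb with both noun vectors. All numbers are invented toy values,
    # and the sentence space is one-dimensional for simplicity, so the verb
    # reduces to an n x n matrix.
    import numpy as np

    dog = np.array([0.9, 0.8, 0.1])
    cat = np.array([0.8, 0.9, 0.2])
    chases = np.random.default_rng(0).random((3, 3))  # toy verb matrix

    # Meaning of "dog chases cat": s^T V o, a scalar in the 1-d sentence space.
    print(dog @ chases @ cat)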
Exploitation Route The findings provide the mathematical and logical machinery for researchers and programmers seeking more comprehensive tools to analyze natural language, and provide a route to making such tools significantly more efficient and less computationally costly. They also demonstrate that other proposed approaches cannot be fruitful for fundamental structural reasons.
Sectors Digital/Communication/Information Technologies (including Software)

URL http://arxiv.org/abs/1303.3170
 
Description Knowledge Transfer Partnership
Amount £470,000 (GBP)
Funding ID KTP010852 
Organisation Innovate UK 
Sector Public
Country United Kingdom
Start 05/2018 
End 04/2021
 
Description FET Open EU Grant Proposal 
Organisation National University of Distance Education
Country Spain 
Sector Academic/University 
PI Contribution We have initiated joint research with UNED to continue the work undertaken in the current project. We have submitted an FET Open grant proposal, which has been made possible as a direct result of the current project.
Collaborator Contribution They have contributed to the writing of the FET Open grant proposal.
Impact None yet
Start Year 2014
 
Description FET Open EU Grant Proposal 
Organisation University of the Basque Country
Country Spain 
Sector Academic/University 
PI Contribution We have initiated joint research with the University of the Basque Country to continue the work undertaken in the current project. We have submitted an FET Open grant proposal, which has been made possible as a direct result of the current project.
Collaborator Contribution They have contributed to the writing of the FET Open grant proposal.
Impact None yet
Start Year 2014
 
Title Dependency based embeddings 
Description We provide dependency-based word embeddings and the related code for a skip-gram word embedding variant that utilizes additional information from dependency graphs. These embeddings can be employed in a broad range of natural language processing applications such as sentiment analysis, question answering and information retrieval. 
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact We also provide experimental results showing that dependency-based embeddings can outperform standard window-based embeddings in many natural language processing tasks. Specifically, for three different classification methods (a Support Vector Machine, a Convolutional Neural Network and a Long Short-Term Memory network), the use of our dependency-based embeddings improves performance on question classification, sentiment analysis and relation classification tasks. 
URL http://www.cs.york.ac.uk/nlp/extvec
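A minimal sketch of how such pre-trained embeddings might be consumed, assuming they are distributed in the common word2vec text format of one "word v1 v2 ... vd" entry per line; the file name extvec.txt is hypothetical:

    # Minimal sketch of loading and querying the embeddings, assuming the
    # common word2vec text format ("word v1 v2 ... vd" per line). The file
    # name extvec.txt is hypothetical.
    import math

    def load_embeddings(path):
        vectors = {}
        with open(path, encoding="utf-8") as f:
            for line in f:
                parts = line.rstrip().split(" ")
                if len(parts) < 3:  # skip a possible "count dim" header line
                    continue
                vectors[parts[0]] = [float(x) for x in parts[1:]]
        return vectors

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))

    vecs = load_embeddings("extvec.txt")  # hypothetical file name
    print(cosine(vecs["dog"], vecs["cat"]))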