A Unified Model of Compositional and Distributional Semantics: Theory and Applications
Lead Research Organisation:
University of Sussex
Department Name: School of Engineering and Informatics
Abstract
Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.
People
David Weir (Principal Investigator)
William Keller (Co-Investigator)
Publications
Bollegala D (2013) Cross-Domain Sentiment Classification Using a Sentiment Sensitive Thesaurus, in IEEE Transactions on Knowledge and Data Engineering
Clarke D (2015) Fast Semantic Parsing with a Tensor Kernel, in International Journal of Computational Linguistics and Applications
Weeds J (2014) Learning to Distinguish Hypernyms and Co-Hyponyms, in Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics
Weir D (2016) Aligning Packed Dependency Trees: A Theory of Composition for Distributional Semantics, in Computational Linguistics
Description | We have discovered a new way to conceive of distributional composition, and developed a theoretical framework called Anchored Packed Trees (APTs) that implements this conception. We have developed a software implementation of this theory and have demonstrated that it can achieve state-of-the-art performance on several key tasks. We have demonstrated that it is possible for a machine learning model to learn to distinguish between different ontological relationships that distributional similarity measures typically conflate. We have compared the effectiveness of a wide variety of proposals regarding how to compose distributional representations of meaning. A distinctive feature of this work is that it is a so-called extrinsic evaluation, focussing on impact on practical applications rather than accuracy on artificial test sets. This has provided greater clarity as to where further research effort in this area should be placed. We have devised a novel approach to distributional composition that involves higher-order dependency relations and are investigating applications of the approach in a number of NLP contexts. We have collaborated in a research initiative investigating ways in which to map distributional representations from one domain to another, and have shown that this very general approach can lead to effective cross-domain methods on a variety of tasks. |
Exploitation Route | The methods being developed in this project (and in other related projects) are showing significant potential in applied natural language processing contexts. In general, this is a result of the fact that, in a machine learning scenario, these methods make it possible to generalise from knowledge of individual word forms to knowledge of the semantics that these words denote. |
Sectors | Digital/Communication/Information Technologies (including Software); Government, Democracy and Justice |
Description | Our findings are being used in several ongoing projects being undertaken with commercial collaborators, in particular projects funded by Innovate UK (formerly TSB) in which we are building applied Natural Language Processing tools. The impact of this project on these applied projects lies in the way it enables the creation of more robust language processing tools. For example, in one project, where we are interfacing with large product databases, we are using distributional methods arising out of this project to create less brittle database matching algorithms. |
First Year Of Impact | 2014 |
Sector | Digital/Communication/Information Technologies (including Software) |
Impact Types | Economic |