Exploitation of Diverse Data via Automatic Adaptation of Knowledge Extraction Software

Lead Participant: LINGUAMATICS LIMITED

Abstract

The current generation of language processing technology has had considerable success in extracting useful information from large amounts of unstructured text, whether research literature or social media. However, adapting such software to a new domain is often laborious, with respect both to diverse types of data (e.g. newswire vs. patent literature) and to the terminology used in a given domain (e.g. in medical practice vs. pharmaceutical research). Humans can perform these tasks on small data sets, but struggle to keep pace with massively increasing amounts of electronic text. The EVOKES project is exploiting distributional similarity techniques to accelerate key components of customisation: the recognition of concepts, and the creation or adaptation of terminologies that link terms to concepts.

Participant                   Role              Project Cost  Grant Offer
LINGUAMATICS LIMITED          Lead Participant  £148,080      £51,585
RUNTIME COLLECTIVE LIMITED    Participant       £102,521      £35,685
UNIVERSITY OF SUSSEX (THE)    Participant       £77,588       £77,588
