Semantic Information Pursuit for Multimodal Data Analysis

Lead Research Organisation: University of Oxford
Department Name: Statistics


In 1948, Shannon published his famous paper "A Mathematical Theory of Communication" [88], which laid the foundations of information theory and led to a revolution in communication technologies. Shannon's fundamental contribution was to provide a precise way by which information could be represented, quantified and transmitted. Critical to Shannon's ideas was the notion that the content of a message is irrelevant to its transmission, since any signal can be represented in terms of bits.

However, Shannon's theory has some limitations. In 1953, Weaver argued that there are three levels
of communication problems: the technical problem "How accurately can the symbols of
communication be transmitted?", the semantic problem "How precisely do the transmitted symbols
convey the desired meaning?", and the effectiveness problem "How effectively does the received
meaning affect conduct in the desired way?" Hence, a key limitation of Shannon's theory is that it
addresses only the technical problem.

This was also pointed out by Bar-Hillel and Carnap in 1953, who argued that "The Mathematical Theory of Communication, often referred to also as Theory (of Transmission) of Information, as practised nowadays, is not interested in the content of the symbols whose information it measures. The measures, as defined, for instance, by Shannon, have nothing to do with what these symbols symbolise, but only with the frequency of their occurrence." While Bar-Hillel and Carnap argued that "the fundamental concepts of the theory of semantic information can be defined in a straightforward way on the basis of the theory of inductive probability", their work was based primarily on logic rules that were applicable to a very restricted class of
signals (e.g. text). In the last 60 years there has been extraordinary progress in information theory,
signal, image and video processing, statistics, machine learning and optimisation, which has led
to dramatic improvements in speech recognition, machine translation, and computer vision technologies.
However, the fundamental question of how to represent, quantify and transmit semantic information remains unresolved, and it is this question that the programme of research shall address.

