Multi-modal content similarity for predicting audience behaviour

Lead Research Organisation: University College London
Department Name: Computer Science

Abstract

Recent multi-modal approaches to Natural Language Processing (NLP) have shown that combining the information expressed by different modalities improves NLP tasks of classification, similarity, and inference. Examples include images with their captions, a word with its typical sound or the way it is pronounced, the audio of a musical piece with its description, and even information extracted from excited brain areas. This is taken as an indication that multi-modal data provides a better way of understanding content, and the goal of this project is to advance the theory and applications in this area. The BBC's news, drama and factual programmes are an excellent source of data and a great case study for this project: they come with metadata (e.g. genre, format, service) and multiple modalities of data (e.g. audio, video, text). The general research methodology is to learn vector representations for the multiple modes separately and/or via joint objective functions, using machine learning algorithms such as classical k-means and neural networks. For training and as an application domain, we suggest tasks that approximate viewers' behaviour with content, such as predicting programme popularity and making content recommendations. We will work within different socio-demographic groups, using the viewers' behaviour data of these groups. The academic supervisor will be Mehrnoosh Sadrzadeh of the Computer Science Department at UCL. The BBC supervisors are Chris Newell and Andrew McParland. Sadrzadeh worked on the preliminary ideas that led to this project through two Royal Academy of Engineering Industrial Scheme Fellowships with BBC R&D (Jan 2017-2018, Sept 2019-2020). BBC R&D has hired two interns to work on small-scale explorations of some aspects of this project (combining audio, genre and subtitles in a dataset of 145 drama programmes).
The preliminary findings have been promising, leading to improvements in precision and diversity of metadata-based recommendations of the kind frequently used by the BBC.
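
As a rough illustration of the methodology described above, the sketch below shows one simple way to combine separately learned per-modality vectors and rank programmes by similarity. It is not the project's method: the late-fusion scheme (normalise each modality, then concatenate), the function names `fuse`, `cosine` and `recommend`, and the toy vectors in `CATALOGUE` are all assumptions made for this example.

```python
import math


def l2_normalise(vec):
    """Scale a vector to unit length so no single modality dominates."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(modalities):
    """Late fusion (an assumed scheme): normalise each modality, then concatenate."""
    fused = []
    for vec in modalities:
        fused.extend(l2_normalise(vec))
    return fused


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(query_id, catalogue, top_k=2):
    """Return the ids of the top_k programmes most similar to query_id."""
    query = fuse(catalogue[query_id])
    scores = [
        (cosine(query, fuse(vecs)), pid)
        for pid, vecs in catalogue.items()
        if pid != query_id
    ]
    scores.sort(reverse=True)
    return [pid for _, pid in scores[:top_k]]


# Hypothetical toy vectors, invented for illustration only.
# Each entry holds (audio, subtitles, genre) vectors for one programme.
CATALOGUE = {
    "drama_a": ([0.9, 0.1], [0.8, 0.2, 0.0], [1.0, 0.0]),
    "drama_b": ([0.8, 0.2], [0.7, 0.3, 0.0], [1.0, 0.0]),
    "news_c": ([0.1, 0.9], [0.0, 0.2, 0.8], [0.0, 1.0]),
}
```

Calling `recommend("drama_a", CATALOGUE, top_k=1)` ranks `drama_b` first, because its audio, subtitle and genre vectors all point in nearly the same direction as those of `drama_a`. A jointly trained representation, as the abstract also proposes, would replace `fuse` with a learned model.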

Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/V519625/1      |              |              | 01/10/2020 | 30/09/2026 |
2481865           | Studentship  | EP/V519625/1 | 11/01/2021 | 27/09/2024 | Taner Cagali