Deep Spatiotemporal Models for Video Representation Learning

Lead Research Organisation: University of Warwick
Department Name: Mathematics

Abstract

Video representation learning is of interest to the Artificial Intelligence community because it advances computer vision by using the dynamics of a scene to inform critical decisions. Compared with still images, video provides more context, which improves the quality of decisions made by systems built on video representation learning. This research aims to contribute to the field by developing a graph-based machine learning model that outperforms current state-of-the-art models. Although video data do not have a naturally occurring graph structure (unlike social networks), a graph-based architecture can significantly reduce the number of parameters needed in the model [1], which makes the proposed model suitable for memory-constrained devices such as mobile phones.
[1] Shirian, A., Tripathi, S., & Guha, T. (2021). Dynamic Emotion Modeling with Learnable Graphs and Graph Inception Network. IEEE Transactions on Multimedia.

The aims and objectives of the research
The objectives of the research are to:
1. develop spatiotemporal graphs for modeling videos;
2. learn the graph adjacency so that it is data-dependent; and
3. extend to heterogeneous graphs that combine multiple data modalities (e.g., video with audio).
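
As an illustration of the first two objectives, the following is a minimal sketch and not the project's actual model. It assumes each video frame is split into a fixed number of regions that act as graph nodes, and that each node carries a small feature vector. The function names (`spatiotemporal_edges`, `data_dependent_adjacency`, `propagate`) and the `weight` matrix, which stands in for a parameter that would be learned during training, are hypothetical. The sketch shows a fixed spatiotemporal edge list and, as an alternative, an adjacency matrix computed from the data by a row-wise softmax over feature similarities.

```python
import math


def spatiotemporal_edges(num_frames, nodes_per_frame):
    """Fixed skeleton: connect nodes within a frame (spatial edges) and
    each node to the same node index in the next frame (temporal edges)."""
    def node_id(t, i):
        return t * nodes_per_frame + i

    edges = []
    for t in range(num_frames):
        for i in range(nodes_per_frame):
            for j in range(i + 1, nodes_per_frame):
                edges.append((node_id(t, i), node_id(t, j)))
            if t + 1 < num_frames:
                edges.append((node_id(t, i), node_id(t + 1, i)))
    return edges


def data_dependent_adjacency(features, weight):
    """Adjacency computed from the data: project each node feature with
    `weight` (assumed learnable, shape d_in x d_out), then apply a row-wise
    softmax to the scaled dot products between projected features."""
    d_out = len(weight[0])
    projected = [
        [sum(x * w for x, w in zip(feat, col)) for col in zip(*weight)]
        for feat in features
    ]
    adjacency = []
    for p_i in projected:
        scores = [
            sum(a * b for a, b in zip(p_i, p_j)) / math.sqrt(d_out)
            for p_j in projected
        ]
        top = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        adjacency.append([e / total for e in exps])
    return adjacency


def propagate(adjacency, features):
    """One message-passing step: each node becomes a weighted average of
    all node features, with weights taken from its adjacency row."""
    dim = len(features[0])
    return [
        [sum(a * feat[k] for a, feat in zip(row, features)) for k in range(dim)]
        for row in adjacency
    ]


if __name__ == "__main__":
    # Toy input: 3 nodes with 2-dimensional features (illustrative values).
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    adj = data_dependent_adjacency(feats, identity)
    print(spatiotemporal_edges(num_frames=3, nodes_per_frame=2))
    print(propagate(adj, feats))
```

In this sketch the only trainable quantity is the small projection matrix, whose size depends on the feature dimension and not on the number of nodes. That is one way a graph formulation can keep the parameter count low, although the actual savings depend on the architecture used.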

The novelty of the research methodology (if any)
The research would contribute to the existing body of knowledge in video representation learning. It aims to design a new graph-based machine learning architecture, new loss functions that penalize model errors more effectively, and optimization techniques that speed up model training.

The potential impact, applications, and benefits
The research would:
1. improve current autonomous navigation systems;
2. be applied to object detection and action prediction in surveillance systems;
3. be used to improve computer vision in robots;
4. be suitable for memory-constrained devices such as mobile phones; and
5. be able to narrate events happening in a scene, which is useful for visually impaired individuals.

How the research relates to the remit

The research cuts across Artificial Intelligence and robotics, mathematical sciences, and information and communication technologies (ICT). These are key areas of interest for the EPSRC: the research would improve visual perception in robotics, broaden knowledge of the applicability of graph theory beyond social networks, and bring self-driving cars closer to practical use.

Research Category: ICT [Information and Communication Technologies], Mathematical Sciences

External Partner - Intel Labs, San Diego.

Planned Impact

In the 2018 Government Office for Science report, 'Computational Modelling: Technological Futures', Greg Clark, the Secretary of State for Business, Energy and Industrial Strategy, wrote "Computational modelling is essential to our future productivity and competitiveness, for businesses of all sizes and across all sectors of the economy". With its focus on computational models, the mathematics that underpins them, and their integration with complex data, the MathSys II CDT will generate diverse impacts beyond academia. These include impacts on skills, the economy, policy and society.

Impacts on skills.
MathSys II will produce a minimum of 50 PhD graduates to support the growing national demand for advanced mathematical modelling and data analysis skills. The CDT will provide each of them with broad core skills in the MSc, a deep knowledge of their chosen research specialisation in the PhD and a complementary qualification in transferable skills integrated throughout. Graduates will thus acquire the profiles needed to form the next generation of leaders in business, government and academia. They will be supported by an integrated pastoral support framework, including a diverse group of accessible leadership role models. The cohort-based environment of the CDT provides a multiplier effect by encouraging cohorts to forge long-lasting professional networks whose value and influence will long outlast the CDT itself. MathSys II will seek to maximise the influence of these networks by providing topical training in Responsible Research and Innovation, by maintaining a robust Equality, Diversity & Inclusion policy, and by integrating with Warwick's global network of international partnerships.

Economic impacts.
The research outputs from many MathSys II PhD projects will be of direct economic value to commercial, public sector and charitable external partners. Engagement with CDT partners will facilitate these impacts. This includes co-supervision of PhD and MSc projects, co-creation of Research Study Groups, and a strong commitment to provide placements/internships for CDT students. When commercial innovations or IP are generated, we will work with Warwick Ventures, the commercial arm of the University of Warwick, to commercialise/license IP where appropriate. Economic impact may also come from the creation of new companies by CDT graduates. MathSys II will present entrepreneurship as a viable career option to students. One external partner, Spectra Analytics, was founded by graduates of the preceding Complexity Science CDT, thus providing accessible role models. We will also provide in-house entrepreneurship training via Warwick Ventures and host events by external start-up accelerator Entrepreneur First.

Impacts on policy.
The CDT will influence policy at the national and international level by working with external partners operating in policy. UK examples include the Department of Health, Public Health England and DEFRA. International examples include the World Health Organisation (WHO) and the European Commission for the Control of Foot-and-mouth Disease (EuFMD). MathSys students will also utilise the recently announced UKRI policy internships scheme.

Impacts on society.
Public engagement will allow CDT students to promote the value of their research to society at large. Aside from social media, suitable local events include DataBeers, Cafe Scientifique, and the Big Bang Fair. MathSys will also promote a socially oriented ethos of technology for the common good. Concretely, this includes the creation of open-source software, the integration of software and data carpentry into our computational and data-driven research training, and championing open access to research. We will also contribute to the 'innovation culture and science' strand of Coventry's 2021 City of Culture programme.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S022244/1                                   01/10/2019  31/03/2028
2431426            Studentship   EP/S022244/1  01/10/2020  30/09/2024  Olayinka Ajayi