Transfer Learning for Frame-based Activity Recognition

Lead Research Organisation: University of Bristol
Department Name: Computer Science

Abstract

Activity recognition is an important task in home surveillance, both for monitoring the wellbeing of children, pets and the elderly and for security purposes. Despite the increasing number of video monitors used in households, few provide smart monitoring. This research will therefore work towards visual activity recognition that detects and classifies actions which can be used to assess the health of an individual.
Most research on activity recognition has focused on recognising a single high-level action in a video (e.g. playing football or making a sandwich). In the context of home surveillance, however, frame-based activity recognition provides more meaningful information: low-level actions (e.g. put down plate or pick up mug) are identified for each frame as soon as they occur.
State-of-the-art methods for activity recognition incorporate Convolutional Neural Networks (CNNs) and/or Recurrent Neural Networks (RNNs). With these methods, incorporating temporal information across a video greatly improves recognition accuracy. Popular approaches include extracting features from both RGB and optical flow frames with CNNs for classification, or training Long Short-Term Memory units (LSTMs) on the RGB features extracted by CNNs. These techniques have shown varying success across datasets; for frame-based action recognition in particular they offer only a small benefit over hand-crafted features. This is partly due to the shortage of training data, which stems from the difficulty of collecting and annotating action datasets.
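
As an illustrative sketch only, and not the project's own model, the following shows how per-frame CNN features can feed an LSTM that outputs an action label for every frame. It assumes PyTorch with an ImageNet-pre-trained ResNet-18 backbone; names such as FrameActionLSTM and the class count are hypothetical placeholders.

    # Minimal sketch: per-frame CNN features followed by an LSTM that predicts
    # an action class for every frame (assumed PyTorch/torchvision setup).
    import torch
    import torch.nn as nn
    from torchvision import models

    class FrameActionLSTM(nn.Module):
        def __init__(self, num_classes, hidden_size=256):
            super().__init__()
            backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
            # Drop the ImageNet classifier; keep convolutional layers + pooling.
            self.cnn = nn.Sequential(*list(backbone.children())[:-1])
            self.lstm = nn.LSTM(input_size=512, hidden_size=hidden_size,
                                batch_first=True)
            self.classifier = nn.Linear(hidden_size, num_classes)

        def forward(self, clip):                      # clip: (B, T, 3, 224, 224)
            b, t = clip.shape[:2]
            feats = self.cnn(clip.flatten(0, 1))      # (B*T, 512, 1, 1)
            feats = feats.flatten(1).view(b, t, -1)   # (B, T, 512)
            out, _ = self.lstm(feats)                 # (B, T, hidden)
            return self.classifier(out)               # per-frame logits (B, T, classes)

    # Example usage: two 8-frame clips, 20 hypothetical action classes.
    model = FrameActionLSTM(num_classes=20)
    logits = model(torch.randn(2, 8, 3, 224, 224))
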
Transfer learning has been shown to improve the accuracy of object recognition when the specific environment the models are tested in has little training data. By pre-training neural networks on larger object datasets, the models can learn features that are also relevant to the test environment before being fine-tuned on the small amount of data available there. Transfer learning in the temporal domain, which has been shown to be important for action recognition, has received little research attention.
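
For illustration, a minimal pre-train-then-fine-tune sketch is given below, again assuming PyTorch and an ImageNet-pre-trained ResNet-18; the target class count and the choice of freezing all but the final layer are assumptions for the example, not the project's method.

    # Minimal transfer-learning sketch: reuse features learned on a large
    # source dataset, then fine-tune a new head on a small target dataset.
    import torch.nn as nn
    from torch.optim import SGD
    from torchvision import models

    # Start from a backbone pre-trained on a large source dataset (ImageNet).
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

    # Freeze the pre-trained layers so their generic features are retained.
    for param in model.parameters():
        param.requires_grad = False

    # Replace the classifier for the target environment's classes
    # (num_target_classes is an assumed placeholder).
    num_target_classes = 10
    model.fc = nn.Linear(model.fc.in_features, num_target_classes)

    # Fine-tune only the new head on the limited target data.
    optimiser = SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
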
This work will focus on designing models that improve the accuracy of frame-based recognition of low-level actions and that transfer well to environments lacking large amounts of training data.

Publications

Studentship Projects

Project Reference   Relationship   Related To     Start        End          Student Name
EP/N509619/1                                      01/10/2016   30/09/2021
1941917             Studentship    EP/N509619/1   18/09/2017   31/03/2021   Jonathan Munro