Transfer Learning for Frame-based Activity Recognition

Lead Research Organisation: University of Bristol
Department Name: Computer Science

Abstract

Activity recognition is an important task in home surveillance, both for monitoring the wellbeing of children, pets and the elderly and for security purposes. Despite the increasing number of video monitors used in households, few provide smart monitoring. This research will therefore work towards visual activity recognition that detects and classifies actions which can be used to assess the health of an individual.
Most research on activity recognition has focused on recognising a single high-level action in a video (e.g. playing football or making a sandwich); however, in the context of home surveillance, frame-based activity recognition provides more meaningful information, labelling low-level actions (e.g. put down plate or pick up mug) for each frame as soon as the action occurs.
State-of-the-art methods for activity recognition incorporate Convolutional (CNN) and/or Recurrent Neural Networks (RNN). With these methods, incorporating temporal information across a video greatly improves recognition accuracy. Popular approaches include extracting features for classification from both RGB and Optical Flow frames using CNNs, or training Long Short-Term Memory units (LSTMs) on the RGB features extracted by CNNs. These techniques have shown varying success across datasets, and for frame-based action recognition in particular they provide only a small benefit over hand-crafted features. This is partly due to the scarcity of training data, given the difficulty of collecting and annotating datasets for actions.
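To make the CNN + LSTM route above concrete, the following is a minimal sketch in PyTorch (the backbone, class count and hidden size are illustrative assumptions, not the project's actual model): a pretrained CNN extracts per-frame features and an LSTM produces a label for every frame.

import torch
import torch.nn as nn
import torchvision.models as models

class FrameLSTMClassifier(nn.Module):
    def __init__(self, num_classes=10, hidden_size=512):
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])  # drop the fc layer
        self.lstm = nn.LSTM(input_size=512, hidden_size=hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, clip):                     # clip: (batch, time, 3, 224, 224)
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1))     # (b*t, 512, 1, 1) per-frame CNN features
        feats = feats.flatten(1).view(b, t, -1)  # (b, t, 512)
        out, _ = self.lstm(feats)                # hidden state at every frame
        return self.fc(out)                      # (b, t, num_classes): one prediction per frame

model = FrameLSTMClassifier()
logits = model(torch.randn(2, 8, 3, 224, 224))   # 2 clips of 8 frames -> (2, 8, 10)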
Transfer learning has been shown to improve the accuracy of object recognition when the target environment has little training data. By pre-training neural networks on larger object datasets, models can learn features that are also relevant to the target environment before being fine-tuned on the small amount of data available from it. Transfer learning in the temporal domain, which has been shown to be important for action recognition, has received little research attention.
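As an illustration of this transfer-learning recipe, the sketch below (assuming a torchvision ImageNet-pretrained ResNet-18 and a placeholder count of five target classes) freezes the pretrained backbone and fine-tunes only a new classification head on the small target dataset.

import torch
import torch.nn as nn
import torchvision.models as models

# Start from a CNN pretrained on a large dataset (ImageNet here).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained backbone: its generic visual features transfer to the new environment.
for p in model.parameters():
    p.requires_grad = False

# Replace the head for the target environment's action classes (5 is a placeholder).
model.fc = nn.Linear(model.fc.in_features, 5)

# Fine-tune only the new head on the small target dataset.
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)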
This work will focus on designing models that improve the accuracy of frame-based recognition of low-level actions and that transfer well to environments lacking large amounts of training data.

Publications


Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/N509619/1                                   01/10/2016  30/09/2021
1941917            Studentship   EP/N509619/1  01/10/2017  31/07/2021  Jonathan Munro
 
Description Mathematical models of fine-grained actions and interactions such as "cutting a tomato" or "tightening a bolt" have a wide range of applications in assistive technologies in homes as well as in industry. Currently, such models perform poorly when deployed in new, unseen environments, as the model has over-fitted to its training environment. This work has shown that, with only unlabelled data from a target environment, which is cheap and easy to collect, models can be adapted to perform well in the deployed environment.
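For illustration, below is a minimal sketch of adversarial feature alignment with a gradient reversal layer (in the style of DANN), one ingredient of this line of work rather than the full published method; the layer sizes, class count and batch data are illustrative assumptions. Labelled source data trains the classifier, while unlabelled target data only feeds a domain discriminator whose reversed gradients push the feature extractor towards domain-invariant features.

import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.clone()

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None   # reverse the gradient flowing into the encoder

feature_extractor = nn.Sequential(nn.Linear(1024, 256), nn.ReLU())
classifier = nn.Linear(256, 8)          # 8 action classes (illustrative)
discriminator = nn.Linear(256, 2)       # predicts source vs. target

src, tgt = torch.randn(16, 1024), torch.randn(16, 1024)   # stand-in features
src_labels = torch.randint(0, 8, (16,))                   # only the source is labelled

f_src, f_tgt = feature_extractor(src), feature_extractor(tgt)
cls_loss = nn.functional.cross_entropy(classifier(f_src), src_labels)

feats = torch.cat([f_src, f_tgt])
dom_labels = torch.cat([torch.zeros(16), torch.ones(16)]).long()
dom_loss = nn.functional.cross_entropy(
    discriminator(GradReverse.apply(feats, 1.0)), dom_labels)

loss = cls_loss + dom_loss   # backward() trains features to fool the discriminator
loss.backward()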
Exploitation Route Academia and industry may use the methods in the publications to improve fine-grained action recognition in target environments. Researchers may be inspired by this work to improve domain adaptation methods for fine-grained action recognition.
Sectors Digital/Communication/Information Technologies (including Software)

 
Description Research and software produced in collaboration with Naver Labs Europe.
First Year Of Impact 2020
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic

 
Title EPIC-KITCHENS-100 
Description Extended footage for the EPIC-KITCHENS dataset, to 100 hours of footage. For automatic annotations, see the separate dataset at: https://doi.org/10.5523/bris.3l8eci2oqgst92n14w2yqi5ytu 10/09/2020 **N.b. please also see ERRATUM published at https://github.com/epic-kitchens/epic-kitchens-100-annotations/blob/master/README.md#erratum** 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact This provided a substantial extension to the EPIC-KITCHENS dataset, with data collected 2 years later. We introduced 6 new challenges to the research community to advance video understanding. 
URL https://data.bris.ac.uk/data/dataset/2g1n6qdydwa9u22shpxqzp0t8m/
 
Title EPIC-Kitchens 
Description Largest dataset in first-person vision, fully annotated with open challenges for object detection, action recognition and action anticipation 
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? Yes  
Impact Open challenges with 15 different universities and research centres competing to win the relevant challenges. 
URL http://epic-kitchens.github.io
 
Description Domain Adaptation for Action Retrieval 
Organisation NAVER LABS Europe
Country France 
Sector Public 
PI Contribution Research collaboration including regular meetings and a scheduled internship for April 2020. The scheduled internship was held remotely due to COVID-19.
Collaborator Contribution Funded internship
Impact -
Start Year 2020
 
Description EPIC-Kitchens Dataset Collection 
Organisation University of Catania
Country Italy 
Sector Academic/University 
PI Contribution Collaboration to collect the largest cross-location dataset of egocentric non-scripted daily activities
Collaborator Contribution Effort time of partners (Dr Sanja Fidler and Dr Giovanni Maria Farinella) in addition to time of their research team members (Dr Antonino Furnari and Mr David Acuna)
Impact ECCV 2018 publication, TPAMI publication under review
Start Year 2017
 
Description EPIC-Kitchens Dataset Collection 
Organisation University of Toronto
Country Canada 
Sector Academic/University 
PI Contribution Collaboration to collect the largest cross-location dataset of egocentric non-scripted daily activities
Collaborator Contribution Effort time of partners (Dr Sanja Fidler and Dr Giovanni Maria Farinella) in addition to time of their research team members (Dr Antonino Furnari and Mr David Acuna)
Impact ECCV 2018 publication, TPAMI publication under review
Start Year 2017
 
Title Code to reproduce results for the Multi-modal Domain Adaptation for Fine-grained Action Recognition 
Description This contains Python code to replicate the results of the publication: Multi-modal Domain Adaptation for Fine-grained Action Recognition. 
Type Of Technology Software 
Year Produced 2020 
Open Source License? Yes  
Impact This code will allow users to adapt fine-grained action recognition to new unlabelled domains. 
URL https://github.com/jonmun/MM-SADA-code
 
Description Oral Presentation for CVPR 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Oral presentation of my publication, Multi-modal Domain Adaptation for Fine-grained Action Recognition, to the research community attending CVPR 2020.
Year(s) Of Engagement Activity 2020
URL http://cvpr2020.thecvf.com/
 
Description PAISS, Artificial Intelligence Summer School 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact University students and industry practitioners met for a summer school with talks from leading researchers in academia and industry.
Year(s) Of Engagement Activity 2018
 
Description Poster at BMVA Symposium on Video Understanding 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Poster presentation to researchers from academia and industry.
Year(s) Of Engagement Activity 2019
URL https://dimadamen.github.io/bmva_symposium_2019/#cfp
 
Description Poster at ICCV 2019 Workshop on Multi-modal Video Analysis and Moments in Time Challenge 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Presented a poster presentation to a number of academics from University and industry.
Year(s) Of Engagement Activity 2019
URL https://sites.google.com/view/multimodalvideo/home