Deep Learning from Crawled Spatio-Temporal Representations of Video (DECSTER)
Lead Research Organisation:
Queen Mary University of London
Department Name: Sch of Electronic Eng & Computer Science
Abstract
Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.
People |
Ioannis Patras (Principal Investigator) |
Publications
Apostolidis E
(2021)
AC-SUM-GAN: Connecting Actor-Critic and Generative Adversarial Networks for Unsupervised Video Summarization
in IEEE Transactions on Circuits and Systems for Video Technology
Apostolidis E
(2021)
Video Summarization Using Deep Neural Networks: A Survey
in Proceedings of the IEEE
Apostolidis E
(2021)
Combining Adversarial and Reinforcement Learning for Video Thumbnail Selection
Apostolidis E
(2020)
Performance over Random
Apostolidis Evlampios
(2022)
Combining Global and Local Attention with Positional Encoding for Video Summarization
Description | The work has focused on deep learning methods for action recognition and action localisation. We have concentrated in particular on fine-grained recognition and have developed baselines for action localisation as outlined in the original project description. A key finding underlying all related publications is that fine-grained temporal analysis, i.e., analysis at increased temporal resolutions, is important for increased performance. We have developed methods for action recognition and action retrieval that rely on mechanisms for feature extraction at high temporal resolution, and on mechanisms for temporal alignment for estimating similarities/distances between videos. We have shown that this leads to increased performance in comparison to coarse, video-level representations. This has been extended to fine-grained (temporal) localisation of actions in long, untrimmed image sequences. A second key finding is that, by using a framework called knowledge distillation, in which networks are used to train each other, it is possible to achieve different trade-offs between accuracy, speed and storage requirements. In a parallel direction, our work on video summarisation has shown the limitations of the current evaluation protocols and how variations of deep learning methods can keep improving the state-of-the-art. |
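To illustrate the knowledge-distillation idea mentioned above, the following is a minimal, hypothetical sketch (not the project's actual training code): a compact student network is trained to match a larger teacher's temperature-softened output distribution via a KL-divergence loss.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between softened teacher and student distributions.

    A higher temperature T exposes the relative probabilities the teacher
    assigns to the non-target classes, which is what the student learns from.
    """
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    # The T**2 factor keeps loss magnitudes comparable across temperatures.
    return float(np.mean(kl) * T * T)

# A small, fast student can thus approximate a large teacher's behaviour,
# trading a little accuracy for speed and storage.
teacher = np.array([[5.0, 2.0, 0.1]])   # logits of a large network
student = np.array([[4.0, 2.5, 0.3]])   # logits of a compact network
loss = distillation_loss(student, teacher)
```

In practice this loss term is combined with the usual cross-entropy on ground-truth labels; the sketch only shows the distillation component.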
Exploitation Route | We have developed and published methods for video recognition, action localisation and video summarisation. We also provide code and datasets in the public domain. Others can use these to benchmark their own methods, to train their models, and to improve on the methods that we have developed. |
Sectors | Creative Economy; Healthcare; Culture, Heritage, Museums and Collections |
Description | We have developed methods that have been used widely by researchers in the field, and have provided the code, models and data that we have used. In addition, in collaboration with CERTH-ITI, we have made publicly available a dataset that has been widely used in the field. |
First Year Of Impact | 2020 |
Sector | Other |
Description | AI4Media |
Amount | € 12,000,000 (EUR) |
Funding ID | 951911 |
Organisation | European Commission |
Sector | Public |
Country | European Union (EU) |
Start | 08/2019 |
End | 09/2023 |
Title | ViSiL code and models |
Description | This repository contains the TensorFlow implementation of the paper "ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning" (ICCV 2019), a method for video retrieval. It provides code for calculating similarities between user-supplied query and database videos, and an evaluation script to reproduce the results of the paper. Video similarity is computed by applying a frame-to-frame similarity function that respects the spatial within-frame structure of videos, followed by a learned video-to-video similarity function that also considers their temporal structure. |
Type Of Material | Computer model/algorithm |
Year Produced | 2020 |
Provided To Others? | Yes |
Impact | ViSiL has drawn attention since it was made publicly available, with 25 forks (people/groups building upon it) and 114 GitHub stars. Various researchers have also contributed pull requests, for instance to port the framework to PyTorch. |
URL | https://github.com/MKLab-ITI/visil |
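The general idea behind ViSiL-style fine-grained similarity can be sketched as follows. This is a simplified, hypothetical illustration: ViSiL itself operates on region-level feature tensors and learns the video-to-video function with a small CNN over the similarity matrix, whereas here per-frame descriptors and a plain chamfer aggregation stand in for both.

```python
import numpy as np

def frame_to_frame_similarity(q_feats, t_feats):
    """Cosine similarity matrix between the frames of two videos.

    q_feats: (Nq, D) frame descriptors of the query video
    t_feats: (Nt, D) frame descriptors of the target video
    """
    q = q_feats / np.linalg.norm(q_feats, axis=1, keepdims=True)
    t = t_feats / np.linalg.norm(t_feats, axis=1, keepdims=True)
    return q @ t.T  # shape (Nq, Nt)

def chamfer_similarity(sim_matrix):
    """Aggregate the frame-level matrix into one video-level score:
    the best match per query frame, averaged over all query frames."""
    return float(sim_matrix.max(axis=1).mean())

rng = np.random.default_rng(0)
query = rng.normal(size=(8, 16))                          # toy frame descriptors
target = np.vstack([query[2:6], rng.normal(size=(4, 16))])  # shares a clip with the query
other = rng.normal(size=(8, 16))                          # unrelated video

s_related = chamfer_similarity(frame_to_frame_similarity(query, target))
s_unrelated = chamfer_similarity(frame_to_frame_similarity(query, other))
```

Because similarity is computed frame by frame before aggregation, a video that shares only a short clip with the query still scores higher than an unrelated one, which is the property that crude whole-video descriptors lose.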
Description | Collaboration with the Institute of Telematics and Informatics |
Organisation | Centre for Research and Technology Hellas (CERTH) |
Country | Greece |
Sector | Academic/University |
PI Contribution | QMUL has a long-standing collaboration with the Institute of Telematics and Informatics, Centre for Research and Technology Hellas (CERTH-ITI). During the DECSTER project, Georgios Kordopatis-Zilos has been carrying out research on video retrieval under the supervision of Ioannis Patras. The work has been aligned with the aims of the DECSTER project, addressing action recognition and, more specifically, action retrieval; it has resulted in two publications in selective Computer Vision and Multimedia Analysis venues. |
Collaborator Contribution | The Informatics and Telematics Institute (CERTH-ITI, Greece) funded the researcher's salary and fees, and provided equipment, travel costs and co-supervision of the research. |
Impact | The collaboration is long-standing, dating back to 2009. Within the DECSTER project, in the period up to 01-2020, there have been two publications aligned with the goals of the project. |
Start Year | 2018 |
Title | Few-Shot Action Localization without Knowing Boundaries |
Description | The repository contains the implementation of "Few-Shot Action Localization without Knowing Boundaries" (International Conference on Multimedia Retrieval, 2021), and provides the training and evaluation code for reproducing the reported results. |
Type Of Technology | Software |
Year Produced | 2021 |
Open Source License? | Yes |
Impact | The code has been very recently released. |
URL | https://github.com/June01/WFSAL-icmr21 |
Title | Performance over Random -- repository |
Description | The repository contains the implementation of "Performance over Random: A Robust Evaluation Protocol for Video Summarization Methods" (28th ACM International Conference on Multimedia (MM '20)) and can be used for evaluating the summaries of a video summarization method using the PoR evaluation protocol. |
Type Of Technology | Software |
Year Produced | 2020 |
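The core idea behind a Performance-over-Random evaluation can be sketched as below. This is a toy illustration of the general principle only: a method's score is reported relative to the expected score of random summaries of the same length. The actual PoR protocol (segment-level evaluation, datasets, score distributions) is defined in the paper and the repository, and the function names here are hypothetical.

```python
import numpy as np

def fscore(summary, ground_truth):
    """F1 overlap between two binary frame-selection vectors."""
    overlap = np.logical_and(summary, ground_truth).sum()
    if overlap == 0:
        return 0.0
    precision = overlap / summary.sum()
    recall = overlap / ground_truth.sum()
    return 2 * precision * recall / (precision + recall)

def performance_over_random(summary, ground_truth, n_samples=1000, seed=0):
    """Score of a summary divided by the mean score of random summaries
    with the same frame budget; values > 1 mean better than chance."""
    rng = np.random.default_rng(seed)
    n_frames, budget = len(summary), int(np.sum(summary))
    random_scores = []
    for _ in range(n_samples):
        rand = np.zeros(n_frames, dtype=bool)
        rand[rng.choice(n_frames, size=budget, replace=False)] = True
        random_scores.append(fscore(rand, ground_truth))
    return fscore(summary, ground_truth) / np.mean(random_scores)

gt = np.zeros(100, dtype=bool); gt[10:25] = True      # annotated key fragment
good = np.zeros(100, dtype=bool); good[12:27] = True  # summary overlapping it
por = performance_over_random(good, gt)
```

Normalising by the random baseline makes scores comparable across videos where chance-level performance differs, which is the weakness of raw F-score comparisons that the PoR protocol addresses.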