Use of visual tracking techniques and subsequent behaviour modelling for applications within the automotive industry

Lead Research Organisation: University of Surrey
Department Name: Centre for Vision, Speech and Signal Processing (CVSSP)

Abstract

The project will investigate the use of visual tracking techniques and subsequent behaviour modelling for various applications within the automotive industry.

In terms of driver assistance, the project will learn relationships between the environment and the driver's controls. If these behavioural norms are broken (e.g. the system expects the driver to begin braking, but they do not), it is possible that the driver has missed an important environmental cue due to distraction, tiredness, etc. In this case the system can indicate to the driver (for example via subtle visual highlighting) exactly which elements of the scene it believes should be contributing to a change in driving behaviour. By exploiting low-latency sensors such as event cameras, the same system may be taken to extremes, providing rapid "reflex level" assistance to drivers travelling at high speeds.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/R512217/1                                   30/09/2017  31/01/2022
1949013            Studentship   EP/R512217/1  24/09/2017  31/01/2022  Jaime Spencer Martin
 
Description Feature representations are high-dimensional descriptions of image content that are more useful than the raw RGB data. They are a vital part of any computer vision pipeline, as they help the machine learning system make sense of the visual data. A good feature representation should be robust to changes in illumination and viewpoint, in order to reliably match points across multiple images. Traditionally, feature description has been done in a hand-crafted, sparse manner, where only a few points in the image are described using properties researchers believed to be useful. In this project we instead describe features densely (i.e. at every pixel in the image) using deep learning, as sketched below.
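
A minimal sketch of what such a dense feature extractor might look like, assuming PyTorch; the architecture, layer sizes and the name DenseFeatureNet are illustrative, not the project's actual network:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseFeatureNet(nn.Module):
    """Maps an RGB image (B, 3, H, W) to per-pixel features (B, D, H, W)."""
    def __init__(self, feat_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.head = nn.Conv2d(128, feat_dim, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = self.head(self.encoder(x))
        # Upsample back to full resolution so every pixel gets a descriptor,
        # then L2-normalise so matching reduces to cosine similarity.
        feats = F.interpolate(feats, size=(h, w), mode="bilinear", align_corners=False)
        return F.normalize(feats, dim=1)
```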

Training these networks typically requires labels indicating whether two points are a correct (positive) or incorrect (negative) match. We showed how selecting different sets of negative examples results in features that are good for different tasks. For example, picking negative points from across the whole image gives better results in semantic segmentation, where global context is more important. On the other hand, forcing negative examples to come from a small area around the correct match makes the features focus on local patches, which is important for disparity estimation.
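
The two sampling strategies could be sketched as follows, assuming PyTorch and a standard triplet margin loss; the function names, the loss and the window radius are assumptions for illustration, not the project's exact formulation:

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    """anchor/positive/negative: (N, D) L2-normalised feature vectors."""
    d_pos = 1 - F.cosine_similarity(anchor, positive, dim=-1)
    d_neg = 1 - F.cosine_similarity(anchor, negative, dim=-1)
    return F.relu(d_pos - d_neg + margin).mean()

def sample_negatives(feats, match_xy, mode="global", radius=16):
    """feats: (D, H, W) dense features; match_xy: (N, 2) integer (x, y)
    coordinates of the true (positive) matches."""
    _, H, W = feats.shape
    n = match_xy.shape[0]
    if mode == "global":
        # Negatives drawn from anywhere in the image: encourages features
        # that separate semantic regions (helps semantic segmentation).
        xs = torch.randint(0, W, (n,))
        ys = torch.randint(0, H, (n,))
    else:
        # Negatives restricted to a small window around the true match:
        # forces discrimination between nearby patches (helps disparity).
        offset = torch.randint(-radius, radius + 1, (n, 2))
        xs = (match_xy[:, 0] + offset[:, 0]).clamp(0, W - 1)
        ys = (match_xy[:, 1] + offset[:, 1]).clamp(0, H - 1)
    return feats[:, ys, xs].t()  # (N, D) negative descriptors
```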

However, it is not always possible to obtain the ground-truth positive and negative matches needed for training. This is especially hard when the data has been collected over a large period of time, e.g. months or years. Using this data would be beneficial, since it could make the network robust to long-term changes (e.g. day vs. night, summer vs. winter). In order to do this, we introduced a training scheme which only needs to know whether two images were taken at roughly the same location. We defined two images as similar if each feature in the first image has only one good match in the second image.
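
One plausible way to implement such a similarity test is via mutual nearest neighbours with a margin, as sketched below; this is an illustrative approximation, not necessarily the exact criterion used in the published work:

```python
import torch

def mutual_matches(feats_a, feats_b, margin=0.1):
    """feats_*: (N, D) L2-normalised descriptors sampled from each image."""
    sim = feats_a @ feats_b.t()                 # (N_a, N_b) cosine similarities
    nn_ab = sim.argmax(dim=1)                   # best match in B for each A
    nn_ba = sim.argmax(dim=0)                   # best match in A for each B
    idx = torch.arange(feats_a.shape[0], device=feats_a.device)
    mutual = nn_ba[nn_ab] == idx                # one-to-one ("only one good match")
    # The best match should clearly beat the runner-up (requires N_b >= 2).
    top2 = sim.topk(2, dim=1).values
    distinct = (top2[:, 0] - top2[:, 1]) > margin
    return mutual & distinct

def images_similar(feats_a, feats_b, threshold=0.5):
    # Treat the images as "same place" if enough features match one-to-one.
    return mutual_matches(feats_a, feats_b).float().mean() > threshold
```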

As an extension to this idea, we developed a system capable of learning multiple tasks at the same time: in this case, depth/motion estimation and feature representations. Using depth and motion information allowed us to establish which points form correct matches. These matches were then used in a similar setting to the one described previously, where point pairs were labelled as positive or negative. Conversely, using the learned feature representations instead of the raw RGB images allowed us to develop a more robust loss for training the depth estimation network. This again allowed us to use seasonal data, which helped the system perform better in these challenging conditions.
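
In this direction, a feature-metric reconstruction loss might look like the sketch below, assuming PyTorch tensors; warp (view synthesis from depth, relative pose and camera intrinsics) is a hypothetical helper, and this is not the exact loss used in DeFeat-Net:

```python
def feature_metric_loss(feat_target, feat_source, depth, pose, K, warp):
    """feat_*: (B, D, H, W) dense features from two frames of a video.
    `warp` is a hypothetical helper performing view synthesis from depth/pose."""
    # Synthesise the target view by warping the source features.
    feat_warped = warp(feat_source, depth, pose, K)
    # A per-pixel L1 error in feature space replaces the RGB photometric error,
    # making the loss more stable under appearance change (day/night, seasons).
    return (feat_target - feat_warped).abs().mean()
```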
Exploitation Route Since feature learning is a relatively low-level task, it is at the core of many different applications. The most common of these is correspondence estimation, which in turn can be used for localisation or motion estimation. Since we focus on learning dense features, we have also shown how these can be used in a much wider range of tasks, including disparity estimation and semantic segmentation.
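
As an illustration of the motion-estimation route, matched pixel coordinates from dense features could be fed to a standard two-view pose pipeline, here using OpenCV; the function below is a generic sketch, not part of the project's codebases:

```python
import cv2
import numpy as np

def relative_pose(pts_a, pts_b, K):
    """pts_*: (N, 2) matched pixel coordinates; K: (3, 3) camera intrinsics."""
    pts_a = np.asarray(pts_a, dtype=np.float64)
    pts_b = np.asarray(pts_b, dtype=np.float64)
    # Robustly estimate the essential matrix from the correspondences.
    E, inliers = cv2.findEssentialMat(pts_a, pts_b, K, method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    # Decompose into rotation and unit-scale translation between the views.
    _, R, t, _ = cv2.recoverPose(E, pts_a, pts_b, K, mask=inliers)
    return R, t
```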

One interesting avenue for future research would be to use these features for multiple simultaneous tasks (i.e. multi-tasking). This would allow for the development of more efficient systems that perform multiple tasks (e.g. autonomous vehicles), improving speed and memory usage.
Sectors Digital/Communication/Information Technologies (including Software), Transport

 
Description This project has helped to foster collaboration with our industrial partner. We hope that this will open avenues for future collaboration between multiple academic and industry teams. Through this project we have also been able to inform their internal research and suggest topics for future directions, in turn informing decisions about future products.
First Year Of Impact 2018
Sector Digital/Communication/Information Technologies (including Software), Transport
Impact Types Economic

 
Description FEPS Faculty Research Support Fund (FRSF) Award - PGR Conference Grant
Amount £995 (GBP)
Funding ID PGR CG 19-106 
Organisation University of Surrey 
Sector Academic/University
Country United Kingdom
Start 05/2019 
End 09/2019
 
Title DeFeat-Net Codebase 
Description This repository contains the network architectures, losses and pretrained models from DeFeat-Net. 
Type Of Technology Software 
Year Produced 2020 
Open Source License? Yes  
Impact The code has been released publicly, allowing other researchers to use it in their projects, as well as ensuring that our published results are reproducible. 
URL https://github.com/jspenmar/DeFeat-Net
 
Title Deja-Vu Features Codebase 
Description This repository contains the network architecture, losses and pretrained feature models from Deja-Vu. 
Type Of Technology Software 
Year Produced 2020 
Open Source License? Yes  
Impact The code has been released publicly, allowing other researchers to use it in their projects, as well as ensuring that our published results are reproducible. 
URL https://github.com/jspenmar/DejaVu_Features
 
Title Scale Adaptive Neural Dense Features Codebase 
Description This repository contains the network architecture and pretrained feature models from "Scale-Adaptive Neural Dense Features". Additionally, we provide a simple script to load these models and run inference on a sample KITTI image. 
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact The code has been released publicly, allowing other researchers to use it in their projects, as well as ensuring that our published results are reproducible. 
URL https://github.com/jspenmar/SAND_features
 
Description Talk at the Geometry and Deep Learning BMVA symposium 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Industry/Business
Results and Impact We presented our recently published conference paper (Scale-Adaptive Neural Dense Features) at the Geometry & Deep Learning symposium organised by the BMVA. The talks were attended by about 70 participants, including a mix of researchers and industry professionals. Following the talk, I engaged with multiple participants, answering questions about the presented work and discussing possible avenues for future work.
Year(s) Of Engagement Activity 2019
URL https://www.youtube.com/watch?v=d85B24kpH7g