Solving Problems with Deep Learning in Scene Understanding in the Autonomous Driving Domain

Lead Research Organisation: University of York
Department Name: Computer Science

Abstract

I am interested in solving problems relating to scene understanding in the autonomous driving domain using deep learning. In particular, I am interested in combining deep learning with geometric domain knowledge to improve accuracy and decision making.

Publications


Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/N509802/1 | | | 01/10/2016 | 31/03/2022 |
1949697 | Studentship | EP/N509802/1 | 01/10/2017 | 31/03/2022 | Bruce Muller
 
Description The initial achievement has come from investigating whether we can use information we already know to improve the performance and explainability of deep models in machine learning, after noticing the tendency in the literature to use deep models as black boxes in the hope that all relevant features will be learned automatically. We have published research (VISAPP 2020 - A Hierarchical Loss for Semantic Segmentation) which indicates the potential of using prior knowledge of semantic class hierarchies (e.g. different types of vehicles and different types of natural phenomena can be related semantically in a hierarchy) to build a loss function that boosts the performance and training speed of deep models for semantic segmentation of images. While standard loss functions treat all classification errors as equally bad (mistaking a truck for the sky is just as bad as mistaking a truck for a car), our implementation differentiates serious errors from minor ones, which may enable the learning of more robust features, important for safety-critical applications.
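To make the mechanism concrete, the following is a minimal sketch and not the published implementation: the class list, toy hierarchy and distance values are illustrative assumptions. A hierarchy-aware loss can weight each wrong class by its distance from the true class in the semantic tree, so that confusing a truck with a car costs less than confusing a truck with the sky.

    # Minimal sketch of a hierarchy-aware loss (illustrative, not the published code).
    import torch
    import torch.nn.functional as F

    # Toy hierarchy: vehicle -> {car, truck}, nature -> {sky, vegetation}.
    # Distance 1 between siblings, 2 across branches (values are assumptions).
    CLASSES = ["car", "truck", "sky", "vegetation"]
    DIST = torch.tensor([
        [0., 1., 2., 2.],
        [1., 0., 2., 2.],
        [2., 2., 0., 1.],
        [2., 2., 1., 0.],
    ])

    def hierarchical_loss(logits, target):
        """Expected hierarchy distance under the predicted distribution.

        logits: (N, C) raw class scores; target: (N,) integer labels.
        The loss is zero when all probability mass sits on the true class and
        grows with the semantic distance of the classes the model favours.
        """
        probs = F.softmax(logits, dim=1)   # (N, C)
        penalties = DIST[target]           # (N, C): distance of every class to the label
        return (probs * penalties).sum(dim=1).mean()

    # Usage: per-pixel logits from a segmentation head, flattened to (N, C).
    logits = torch.randn(6, len(CLASSES), requires_grad=True)
    target = torch.tensor([0, 1, 2, 3, 1, 0])
    loss = hierarchical_loss(logits, target)
    loss.backward()

In practice a term of this kind can be combined with standard cross-entropy, so that within-branch confusions are penalised less heavily than cross-branch ones.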

It should be noted that the main part of my project is still in progress and initial results are promising. We endeavour to build a working system for estimating camera localisation from one image to the next, specifically for road scenes, which could be extremely useful for applications such as autonomous driving. We have devised a method for training a deep neural network to learn relative camera pose specifically for road scenes. Further, we have acquired a very large, high-quality road-scene dataset (unavailable to the wider community) which will enable us to build models that perform well on unseen images for relative camera pose estimation. Additionally, we have formulated a novel process for training our model on road scenes, which is missing from the current literature. Like the published contribution described above, this work will make use of prior understanding of the 3D world to make deep models more transparent and potentially to solve relative camera localisation to a degree sufficient for industrial use in real-time road-scene applications.
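For illustration only, the following is a minimal sketch of one common way to set up relative pose regression from an image pair; the two-branch encoder, layer sizes, quaternion parameterisation and loss weighting are assumptions, not our actual architecture or training process.

    # Minimal sketch of relative pose regression from an image pair (illustrative only).
    import torch
    import torch.nn as nn

    class RelativePoseNet(nn.Module):
        def __init__(self):
            super().__init__()
            # Shared convolutional encoder applied to both frames.
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 32, 7, stride=2, padding=3), nn.ReLU(),
                nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            # Regress a 3-vector translation plus a 4-vector quaternion rotation.
            self.head = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 7))

        def forward(self, frame_a, frame_b):
            feats = torch.cat([self.encoder(frame_a), self.encoder(frame_b)], dim=1)
            out = self.head(feats)
            t, q = out[:, :3], out[:, 3:]
            q = q / q.norm(dim=1, keepdim=True)   # keep the rotation on the unit sphere
            return t, q

    def pose_loss(t_pred, q_pred, t_true, q_true, beta=100.0):
        # Weighted sum of translation and rotation errors (beta is an assumption).
        return (t_pred - t_true).norm(dim=1).mean() + beta * (q_pred - q_true).norm(dim=1).mean()

    # Usage with dummy consecutive road-scene frames.
    net = RelativePoseNet()
    frame_a, frame_b = torch.randn(2, 3, 128, 256), torch.randn(2, 3, 128, 256)
    t, q = net(frame_a, frame_b)

Geometric domain knowledge, such as the near-planar motion of a road vehicle, can further constrain the pose parameterisation beyond this generic setup.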
Exploitation Route Firstly, the hierarchical loss function, which differentiates serious from minor errors, could be used by others to train deep models faster and to boost performance, provided their classes form a clear hierarchy. The idea is highly generalisable and requires little effort to integrate into most systems. It can be researched further to show clearly how robust the learned features are compared with those obtained from traditional training methods. Moreover, this research is not limited to semantic segmentation and in principle can be applied to any classification task.

Secondly, depending on how the current research progresses, our relative pose estimator could be used in the wild to estimate camera movement in road scenes quickly and accurately. This is useful for many tasks and industries. For example, it could be used by local councils and other road surveyors to build accurate, up-to-date maps of ever-changing road conditions.
Sectors Government, Democracy and Justice; Transport

 
Title Relative Pose Estimator 
Description We are building a deep learning model for estimating relative pose from image pairs captured by a single camera, specific to planar scenes (road scenes in our case). 
Type Of Material Computer model/algorithm 
Year Produced 2020 
Provided To Others? No  
Impact Current results are promising and could potentially provide highly accurate estimation of how a vehicle moves from one point to the next, based purely on a single camera. Many applications can benefit from relative pose estimation, for example stitching images to form an overall map of the road. Potential beneficiaries include road surveyors and local councils, for automated or assisted road maintenance, and potentially autonomous driving applications. 
 
Description Gaist Solutions Dataset 
Organisation Gaist Solutions
Country United Kingdom 
Sector Private 
PI Contribution Developing a novel system (still in progress) for estimating relative camera pose from image pairs of road scenes using deep learning models. Gaist are road surveyors, and this could be very helpful in some of their pipelines.
Collaborator Contribution Gaist provide a large sample of their road-scene dataset, which is enabling us to train deep neural networks for the task of relative camera pose estimation in road scenes.
Impact The collaboration is not multi-disciplinary and involves only the sharing of data on Gaist's part. The research is still in progress and so does not yet have outputs, but initial results are promising.
Start Year 2019