Solving Problems with Deep Learning in Scene Understanding in the Autonomous Driving Domain

Lead Research Organisation: University of York
Department Name: Computer Science

Abstract

I am interested in solving problems relating to scene understanding in the autonomous driving domain using deep learning. In particular, I am interested in combining deep learning with geometric domain knowledge to improve accuracy and decision making.

Publications


Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/N509802/1 | | | 01/10/2016 | 31/03/2022 |
1949697 | Studentship | EP/N509802/1 | 01/10/2017 | 31/03/2022 | Bruce Muller
 
Description The initial achievement has come from investigating whether we can use information we already know to improve the performance and explainability of deep models in machine learning, after noticing the tendency in the literature to use deep models as black boxes in the hope that all relevant features will be learned automatically. We have published research (VISAPP 2020 - A Hierarchical Loss for Semantic Segmentation) which indicates the potential of using prior knowledge of semantic class hierarchies (e.g. different types of vehicles and different types of natural phenomena can be related semantically in a hierarchy) to build a loss function that boosts the performance and training speed of deep models for semantic segmentation of images. While standard loss functions treat all classification errors as equally bad (mistaking a truck for the sky is just as bad as mistaking a truck for a car), our implementation differentiates serious errors from minor ones, which may enable the learning of more robust features, important for safety-critical applications.
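To make the mechanism concrete, the following is a minimal sketch and not the published implementation: the class list, toy hierarchy and distance values are illustrative assumptions. A hierarchy-aware loss can weight each wrong class by its distance from the true class in the semantic tree, so that confusing a truck with a car costs less than confusing a truck with the sky.

    # Minimal sketch of a hierarchy-aware loss (illustrative, not the published code).
    import torch
    import torch.nn.functional as F

    # Toy hierarchy: vehicle -> {car, truck}, nature -> {sky, vegetation}.
    # Distance 1 between siblings, 2 across branches (values are assumptions).
    CLASSES = ["car", "truck", "sky", "vegetation"]
    DIST = torch.tensor([
        [0., 1., 2., 2.],
        [1., 0., 2., 2.],
        [2., 2., 0., 1.],
        [2., 2., 1., 0.],
    ])

    def hierarchical_loss(logits, target):
        """Expected hierarchy distance under the predicted distribution.

        logits: (N, C) raw class scores; target: (N,) integer labels.
        The loss is zero when all probability mass sits on the true class and
        grows with the semantic distance of the classes the model favours.
        """
        probs = F.softmax(logits, dim=1)   # (N, C)
        penalties = DIST[target]           # (N, C): distance of every class to the label
        return (probs * penalties).sum(dim=1).mean()

    # Usage: per-pixel logits from a segmentation head, flattened to (N, C).
    logits = torch.randn(6, len(CLASSES), requires_grad=True)
    target = torch.tensor([0, 1, 2, 3, 1, 0])
    loss = hierarchical_loss(logits, target)
    loss.backward()

In practice a term of this kind can be combined with standard cross-entropy, so that within-branch confusions are penalised less heavily than cross-branch ones.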

It should be noted that the main part of my project is still in progress and initial results are promising. We endeavour to build a working system for estimating camera localisation from one image to the next, specifically for road scenes, which could be extremely useful for applications such as autonomous driving. We have devised a method for training a deep neural network to learn relative camera pose specifically for road scenes. Further, we have acquired a very large, high-quality road-scene dataset (unavailable to the wider community) which will enable us to build models that perform well on unseen images for relative camera pose estimation. Additionally, we have formulated a novel process for training our model on road scenes, which is missing from the current literature. Like the published contribution described above, this work will make use of prior understanding of the 3D world to make deep models more transparent and potentially to solve relative camera localisation to a degree sufficient for industrial use in real-time road-scene applications.
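For illustration only, the following is a minimal sketch of one common way to set up relative pose regression from an image pair; the two-branch encoder, layer sizes, quaternion parameterisation and loss weighting are assumptions, not our actual architecture or training process.

    # Minimal sketch of relative pose regression from an image pair (illustrative only).
    import torch
    import torch.nn as nn

    class RelativePoseNet(nn.Module):
        def __init__(self):
            super().__init__()
            # Shared convolutional encoder applied to both frames.
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 32, 7, stride=2, padding=3), nn.ReLU(),
                nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            # Regress a 3-vector translation plus a 4-vector quaternion rotation.
            self.head = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 7))

        def forward(self, frame_a, frame_b):
            feats = torch.cat([self.encoder(frame_a), self.encoder(frame_b)], dim=1)
            out = self.head(feats)
            t, q = out[:, :3], out[:, 3:]
            q = q / q.norm(dim=1, keepdim=True)   # keep the rotation on the unit sphere
            return t, q

    def pose_loss(t_pred, q_pred, t_true, q_true, beta=100.0):
        # Weighted sum of translation and rotation errors (beta is an assumption).
        return (t_pred - t_true).norm(dim=1).mean() + beta * (q_pred - q_true).norm(dim=1).mean()

    # Usage with dummy consecutive road-scene frames.
    net = RelativePoseNet()
    frame_a, frame_b = torch.randn(2, 3, 128, 256), torch.randn(2, 3, 128, 256)
    t, q = net(frame_a, frame_b)

Geometric domain knowledge, such as the near-planar motion of a road vehicle, can further constrain the pose parameterisation beyond this generic setup.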
Exploitation Route Firstly, the hierarchical loss function, which differentiates serious from minor errors, could be used by others to train deep models faster and to boost performance, provided their classes form a clear hierarchy. The idea is highly generalisable and requires little effort to integrate into most systems. It can be researched further to show clearly how robust the learned features are compared with those obtained from traditional training methods. Moreover, this research is not limited to semantic segmentation and in principle can be applied to any classification task.

Secondly, depending on how the current research progresses, our relative pose estimator could be used in the wild to estimate camera movement in road scenes quickly and accurately. This is useful for many tasks and industries. For example, it could be used by local councils and other road surveyors to build accurate, up-to-date maps of ever-changing road conditions.
Sectors Government, Democracy and Justice; Transport

 
Title Relative Pose Estimator 
Description We are building a deep learning model for estimating relative pose from image pairs captured by a single camera, specific to planar scenes (road scenes in our case). 
Type Of Material Computer model/algorithm 
Year Produced 2020 
Provided To Others? No  
Impact Current results are promising and could potentially provide highly accurate estimation of how a vehicle moves from one point to the next, based purely on a single camera. Many applications can benefit from relative pose estimation, for example stitching images to form an overall map of the road. Potential beneficiaries include road surveyors and local councils, for automated or assisted road maintenance, and potentially autonomous driving applications. 
 
Description Gaist Solutions Dataset 
Organisation Gaist Solutions
Country United Kingdom 
Sector Private 
PI Contribution Developing a novel system (still in progress) for estimating relative camera pose from image pairs of road scenes using deep learning models. Gaist are road surveyors, and this could be very helpful in some of their pipelines.
Collaborator Contribution Gaist provide a large sample of their road-scene dataset, which is enabling us to train deep neural networks for the task of relative camera pose estimation in road scenes.
Impact The collaboration is not multi-disciplinary and involves only the sharing of data on Gaist's part. The research is still in progress and so does not yet have outputs, but initial results are promising.
Start Year 2019