DEVA - Autonomous Driving with Emergent Visual Attention

Lead Research Organisation: University of Exeter
Department Name: Engineering Computer Science and Maths

Abstract

How does a racer drive around a track? Approaching a bend, a driver must monitor the road, steer around curves, manage speed and plan a trajectory that avoids collisions with other cars - and all of this, fast and accurately. For robots this remains a challenge: despite decades of progress in computer vision, artificial vision systems remain far from human vision in performance, robustness and speed. As a consequence, current prototypes of self-driving cars rely on a wide variety of sensors to compensate for the limitations of their visual perception. One crucial aspect that distinguishes human from artificial vision is our capacity to focus and shift our attention. This project will propose a new model of visual attention for a robot driver, and investigate how attention focusing can be learnt automatically by trying to improve the robot's driving.

How and where we focus our attention when solving a task such as driving is studied by psychologists, and the numerous models of attention fall into two categories: first, top-down models capture how world knowledge and expectations guide our attention when performing a specific task; second, bottom-up models characterise how properties of the visual signal make specific regions capture our attention, a property often referred to as saliency. Yet, from a robotics perspective, a unified framework describing the interplay of bottom-up and top-down attention is still lacking, especially for a dynamic, time-critical task such as driving. In the racing scenario described above, the driver must take quick and decisive action to steer around bends and avoid obstacles - efficient use of attention is therefore critical.

This project will investigate the hypothesis that our attention mechanisms are learnt on a task-specific basis, in such a way as to provide our visual system with optimal information for performing the task. We will investigate how state-of-the-art computer vision and machine learning approaches can be used to learn attention, perception and action jointly, to allow a robot driver to compete with humans on a racing simulator using visual perception only.

A generic learning framework for task-specific attention will be developed that is applicable across a broad range of visual tasks, and has the potential to narrow the gap with human performance through a critical reduction in current processing times.

Planned Impact

This project will have impact in three communities:
(1) Computer vision and robotics community
(2) Car safety and autonomous cars industry
(3) Psychologists in attention research

The computer vision and robotics community will benefit directly from the new knowledge and techniques developed during this project. By proposing a new approach to reduce the amount of visual data to be processed while solving robotic tasks, the proposed framework could lead to significant improvements in efficiency for vision-based robotics. Additionally, the proposed scenario will offer new insights into the applicability of the embodied cognition paradigm to a wider class of computer vision problems. To ensure maximal impact, in addition to the academic papers, the code will be released within two popular code bases: ROS and OpenCV. The software required to interface with the racing simulator will also be released to facilitate comparison.

Moreover, this project will devise new tools and approaches for the driver-assistance and driverless-car industry. Monitoring the driver's attention is becoming an essential concern as more sophisticated cars also provide more distractions for the driver. This project will provide a better understanding of ideal gaze patterns when driving. In addition, the attention process developed in this work will provide efficient alternatives to current vision-based driving systems, potentially reducing the reliance on additional sensors.

Finally, this project also has the potential to impact the psychological community by providing a new analysis tool for eye gaze in dynamic tasks based on the proposed model. Eye tracking is a popular paradigm for the analysis of human subjects' attention shifts, applied to a broad range of cognitive tasks. The proposed approach will provide a new tool for analysing attentional patterns, by comparing human gaze locations with locations where an optimal information processing system would focus its attention when solving the given task.
 
Description The project has been investigating the importance of attention in visual perception, in particular for active tasks. We have investigated computational models attempting to predict where an observer would look in an image, a property called saliency.
1. Saliency as detection: Our research has investigated the state-of-the-art models, based on deep neural networks, and found that the common formulation as a regression of a complete saliency map is inefficient: fundamentally, either a location is salient or it is not, and therefore the problem can be effectively reframed as detecting high-saliency regions. We demonstrated performance comparable to state-of-the-art models with much reduced training time (under review).
2. We devised a new approach for visualising what is learnt by deep saliency models, and demonstrated that those models have developed receptive fields consistent with psychological theories (published).
3. We also investigated reinforcement learning approaches for learning to steer from visual images, and developed a new RL algorithm to address the poor sample efficiency of state-of-the-art RL, i.e. the large number of training episodes required (under review).
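The "saliency as detection" reformulation in finding 1 can be illustrated with a minimal sketch: rather than regressing a dense per-pixel saliency map, the system commits to a small number of discrete high-saliency locations. The function and parameter names below (`detect_salient_regions`, `suppress_radius`) and the greedy non-maximum-suppression scheme are illustrative assumptions, not the project's under-review method.

```python
import numpy as np

def detect_salient_regions(saliency_map, k=2, suppress_radius=5):
    """Pick the top-k salient locations by greedy non-maximum
    suppression: take the global maximum, blank out its neighbourhood,
    and repeat. Returns (row, col) tuples. Illustrative sketch only."""
    m = saliency_map.astype(float).copy()
    peaks = []
    for _ in range(k):
        y, x = np.unravel_index(np.argmax(m), m.shape)
        peaks.append((int(y), int(x)))
        # Suppress the neighbourhood so the next pick is a distinct region
        m[max(0, y - suppress_radius):y + suppress_radius + 1,
          max(0, x - suppress_radius):x + suppress_radius + 1] = -np.inf
    return peaks

# A synthetic "predicted" saliency map with two Gaussian blobs
yy, xx = np.mgrid[0:32, 0:32]
smap = np.exp(-((yy - 8) ** 2 + (xx - 8) ** 2) / 8.0) \
     + 0.8 * np.exp(-((yy - 24) ** 2 + (xx - 20) ** 2) / 8.0)
peaks = detect_salient_regions(smap, k=2)  # → [(8, 8), (24, 20)]
```

The appeal of the detection view is that downstream processing (here, two coordinate pairs) scales with the number of salient regions rather than with the image size, which is what makes the reduced training and processing time plausible.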
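The sample-efficiency issue in finding 3 can be made concrete with a toy REINFORCE sketch of learning to steer. Everything here is invented for illustration - a one-dimensional lane-keeping task with a scalar state instead of images, a single-parameter policy, and the textbook policy-gradient update - and is far simpler than the project's under-review algorithm.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def episode(w, rng, horizon=30):
    """Toy lane keeping: state x is the lateral offset from the lane
    centre; action a = +1 steers so that x drops by 0.3, a = -1 the
    opposite; reward -|x| favours staying centred."""
    x = rng.uniform(-1.0, 1.0)
    traj = []
    for _ in range(horizon):
        p = sigmoid(w * x)                  # policy: P(a = +1 | x)
        a = 1 if rng.random() < p else -1
        traj.append((x, a))
        x = x - 0.3 * a + rng.normal(0.0, 0.05)
    ret = -sum(abs(xt) for xt, _ in traj)   # episode return
    return traj, ret

def reinforce(episodes=400, lr=0.05, seed=0):
    """Vanilla REINFORCE with a running-mean baseline:
    w <- w + lr * (G - b) * mean_t d/dw log pi(a_t | x_t)."""
    rng = np.random.default_rng(seed)
    w, baseline = 0.0, 0.0
    for _ in range(episodes):
        traj, ret = episode(w, rng)
        adv = ret - baseline
        baseline += 0.05 * (ret - baseline)
        # d/dw log pi(a|x) for a Bernoulli policy with logit w*x
        grad = sum(x * ((a + 1) / 2 - sigmoid(w * x)) for x, a in traj)
        w += lr * adv * grad / len(traj)
    return w
```

Even on this one-parameter problem, vanilla REINFORCE needs hundreds of whole episodes of experience; with image inputs and deep policies the episode count grows dramatically, which is the inefficiency the project's new algorithm targets.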
Exploitation Route The project is still ongoing. We have further findings currently under review and in preparation for publication, and we are in discussions about applying for further funding to continue this line of research.
Sectors Digital/Communication/Information Technologies (including Software)

 
Description Christmas Lecture 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Schools
Results and Impact This Christmas lecture was given by the PI to an audience of school children, covering autonomous-car technologies, new developments and remaining challenges, as well as some of the aspects of machine learning for autonomous cars studied in the DEVA project. The presentation raised many interested questions from the students, and led to an interesting discussion on ethical and societal aspects of the technology and the research that underlies it.
Year(s) Of Engagement Activity 2017
URL https://emps.exeter.ac.uk/news-events/events-colloquia/event/?semID=2129&dateID=4747