Deep Learning-based Computer Vision Methods

Lead Research Organisation: University of Oxford
Department Name: Engineering Science


Visual understanding of 3D from images and video
This project falls within the EPSRC Information and Communication Technologies research area.

The objective of this research is to match, describe and categorise image and video content. The aim is to achieve human-like performance and beyond, for example in recognising configurations of parts and spatial layout, counting and delineating objects, or recognising human actions and interactions in videos, overcoming the current limitations of computer vision systems and enabling new and far-reaching applications. This research will build on the success of discriminatively trained recognition systems in recognising visual content in images. The research will involve developing combined generative and discriminative models that can infer the attributes and shape of novel visual material. The new algorithms will learn automatically, building on recent breakthroughs in deep machine learning. They will be capable of weakly-supervised learning, for example from images and videos downloaded from the internet, and will require very little human supervision.

This work has applications to image categorisation and search in large image and video datasets. Our vision is that anything visual should be searchable, in the manner of a Google search of the web: by specifying a query, and having results returned immediately, irrespective of the size of the data. Such enabling capabilities will have widespread application both for general image/video search - consider how Google's web search has opened up new areas - and also for designing customised solutions for searching. Another application is the automatic counting of large numbers of objects in images. This is used in a variety of domains, from counting cells in medical images to counting cars in traffic or satellite data.

The ubiquity of digital imaging means that every UK citizen may potentially benefit from the Programme research in different ways. One example is an enhanced iPlayer that can search for where particular characters appear in a programme, or intelligently fast forward to the next 'hugging' sequence. A second is wider deployment of lower cost imaging solutions in healthcare delivery. A third, also motivated by healthcare, is the use of new machine learning methods for validating targets for drug discovery based on microscopy images.

Studentship Projects

Project Reference   Relationship   Related To     Start        End          Student Name
EP/N509711/1                                      01/10/2016   30/09/2021
1940016             Studentship    EP/N509711/1   01/10/2017   30/05/2021   Erika Lu