Deep Learning-based Computer Vision Methods

Lead Research Organisation: University of Oxford

Department Name: Engineering Science

Abstract

Visual understanding of 3D from images and video
This project falls within the EPSRC Information and Communication Technologies research area.

The objective of this research is to match, describe and categorise image and video content. The aim is to achieve human-like performance and beyond, for example in recognising configurations of parts and spatial layout, counting and delineating objects, or recognising human actions and interactions in videos, thereby overcoming the current limitations of computer vision systems and enabling new and far-reaching applications. This research will build on the success of discriminatively trained recognition systems in recognising visual content in images. It will involve developing combined generative and discriminative models that can infer the attributes and shape of novel visual material. The new algorithms will learn automatically, building on recent breakthroughs in deep machine learning. They will be capable of weakly-supervised learning, for example from images and videos downloaded from the internet, and will require very little human supervision.

This work has applications in image categorisation and search in large image and video datasets. Our vision is that anything visual should be searchable, in the manner of a Google search of the web: by specifying a query and having results returned immediately, irrespective of the size of the data. Such enabling capabilities will have widespread application both for general image/video search - consider how Google's web search has opened up new areas - and also for designing customised solutions for searching. Another application is the automatic counting of large numbers of objects in images. This is used in a variety of domains, from counting cells in medical images to counting cars in traffic or satellite data.

The ubiquity of digital imaging means that every UK citizen may potentially benefit from the Programme research in different ways. One example is an enhanced iPlayer that can search for where particular characters appear in a programme, or intelligently fast-forward to the next 'hugging' sequence. A second is wider deployment of lower-cost imaging solutions in healthcare delivery. A third, also motivated by healthcare, is the use of new machine learning methods for validating targets for drug discovery based on microscopy images.

Student:

Erika Lu

Period of Study:

Oct 17 - Mar 21

Funder:

EPSRC

Project Status:

Closed

Project Category:

Studentship

Project Reference:

1940016

Research Topic:

Unclassified

Organisations

University of Oxford (Lead Research Organisation)

People	ORCID iD
Andrew Zisserman (Primary Supervisor)
Erika Lu (Student)

Publications

Author Name Title Publication Date Published

Lu E (2019) Computer Vision - ACCV 2018 - 14th Asian Conference on Computer Vision, Perth, Australia, December 2-6, 2018, Revised Selected Papers, Part III

Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
EP/N509711/1			01/10/2016	30/09/2021
1940016	Studentship	EP/N509711/1	01/10/2017	31/03/2021	Erika Lu
