3D Shape Understanding using Deep Learning

Lead Research Organisation: University of Oxford
Department Name: Engineering Science


Much current computer vision research applies deep learning to tasks such as segmentation, depth estimation, and object detection. We hope to apply these techniques to extract information about the 3D shape of an object from a single image. This would build on recent success in computing the depth and surface normals of a scene from a single image. However, our approach would focus on 3D object understanding rather than a 2.5D representation.

Current approaches to understanding 3D shape and properties fall into four main categories. The first uses neural networks to perform object detection from point clouds (a 3D representation of an object). However, we want to focus on connecting the 2D representation of an image with its corresponding 3D model. The second approach estimates 3D properties (such as vanishing points or symmetry) from an image. The third approach uses known representations of the class of object being detected to morph a model of the class into the object instance. However, we wish to extract generic shape information, without knowledge of the object instance. The fourth, a very new direction, attempts to extract the 3D shape directly from the image. This has only recently become possible thanks to datasets such as ShapeNet [1], which include a large collection of 3D shapes for a given object category. However, this direction is still in its infancy, and existing methods are neither robust nor qualitatively good.

Our aim is to investigate better ways of automatically understanding 3D shapes and properties. This problem is extremely difficult because it is underdetermined: a range of possible shapes could correspond to one image. We aim to combine deep learning methods with information that can be extracted about the object, such as its axes of symmetry, to produce qualitatively and quantitatively good models. We are also investigating the best way to do this: whether to estimate the shape directly or to estimate it implicitly using an auxiliary task. An effective, general-purpose, and robust approach to 3D shape understanding would advance the state of the art in the field.
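To illustrate why recovering shape from one image is underdetermined, the following is a minimal sketch that assumes an idealised pinhole camera. This camera model and the helper names `project` and `scale_shape` are assumptions made for illustration only; they are not taken from the project. The sketch shows that a small square near the camera and a square three times larger placed three times further away produce the same image coordinates.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (x, y, z) onto the image plane of an ideal pinhole camera.

    The camera sits at the origin and looks along the positive z axis
    (an illustrative assumption, not the project's camera model).
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


def scale_shape(points, factor):
    """Scale every 3D point about the camera centre by the same factor."""
    return [(factor * x, factor * y, factor * z) for x, y, z in points]


# A unit square two units in front of the camera.
small_near = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (1.0, 1.0, 2.0), (0.0, 1.0, 2.0)]

# The same square made three times larger and moved three times further away.
large_far = scale_shape(small_near, 3.0)

image_a = [project(p) for p in small_near]
image_b = [project(p) for p in large_far]

# Both shapes project to the same 2D points, so the image alone
# cannot tell them apart.
same_image = all(
    abs(ua - ub) < 1e-12 and abs(va - vb) < 1e-12
    for (ua, va), (ub, vb) in zip(image_a, image_b)
)
print("Projections identical:", same_image)
```

Scale is only one of the ambiguities; the unseen back of an object is similarly unconstrained, which is where additional cues such as symmetry could help narrow the set of plausible shapes.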

Our objective is, given an image, to understand the 3D properties of the objects in the scene. This would be useful for a variety of tasks, such as grasp detection (e.g. how a robot would pick up an item) and improved object recognition (with a good understanding of an object's shape, finding it from completely different viewpoints should be easier and require fewer training examples).

This project aligns with the EPSRC ICT research area, as we hope to advance the state of the art in computer vision, helping to pave the way for its application in other engineering domains.

[1] Chang, Angel X., et al. "ShapeNet: An information-rich 3D model repository." arXiv preprint arXiv:1512.03012 (2015).

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/N509711/1                                   01/10/2016  30/09/2021
1798398            Studentship   EP/N509711/1  01/10/2016  30/09/2020  Olivia Wiles
Description: I have investigated a variety of ways to represent and infer 3D from single or multiple images.
Exploitation Route: Being able to infer 3D has many applications in autonomous driving, computational photography, and VR/AR. Downstream work has built on my ideas or similar ones, particularly in computational photography (for example, the 3D photos app from Facebook is related).
Sectors: Creative Economy; Culture, Heritage, Museums and Collections