GNOMON: Deep Generative Models in non-Euclidean Spaces for Computer Vision & Graphics

Lead Research Organisation: Imperial College London
Department Name: Computing

Abstract

Over the past decade, deep learning methods have had an enormous impact on the academic and industrial worlds, opening new multi-billion markets ranging from driverless cars to speech recognition and machine translation. Deep learning had been an emerging technology for decades; it took an orchestrated scientific and engineering effort, together with growing computational power and large datasets, to achieve its broad technological and societal impact. Most successful deep learning methods, such as Deep Convolutional Neural Networks (DCNNs), are based on classical signal and image processing models, which limits their applicability to data with an underlying Euclidean grid-like structure, e.g., 2D/3D images or audio signals. Non-Euclidean (graph- or manifold-structured) data are becoming increasingly abundant; prominent examples include 3D objects (represented as meshes or point clouds) in computer vision (CV) and graphics, as well as social networks, graphs of molecules, and interactomes. Until recently, this mismatch has been a significant obstacle to the adoption of machine learning (ML) tools in some of the most promising fields. To bridge the gap between Euclidean (e.g., images, videos and speech) and non-Euclidean (e.g., graphs and manifolds) ML, umbrella terms such as "Geometric Deep Learning" (GDL) have recently been coined.

Such methods have attracted keen interest in the ML community over the past couple of years, since graphs can model very abstract systems of relations or interactions and can thus potentially be applied across the board. Recent successful applications of non-Euclidean deep learning are as diverse as semantic segmentation on meshes and point clouds, drug design, and event classification in particle physics. Nevertheless, the focus has mainly been on discriminative approaches (e.g., classification and segmentation problems), and limited progress has been made towards generative methodologies (i.e., unsupervised methodologies that model the distribution of data) on non-Euclidean spaces. The drawback of discriminative methodologies is that they require a massive amount of labelled data, mostly annotated by hand, which is very expensive or even impossible to obtain in many settings. Generative approaches, on the other hand, can operate in unsupervised scenarios and can even be used to produce data for training discriminative approaches. Currently available generative frameworks have been developed primarily for Euclidean data (e.g., images, videos) and are not suitable for the non-Euclidean setting.

GNOMON aims to bridge this gap by developing a mathematically principled framework for designing and implementing generative models for non-Euclidean domains such as graphs or manifolds. We will explore challenging problems in 3D CV and graphics. Nevertheless, the techniques will be designed to be general, so that they can aid research in many other fields.

Publications

Alexandridis KP (2023) Inverse Image Frequency for Long-Tailed Image Recognition. in IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society

Rainer G (2023) Neural Shading Fields for Efficient Facial Inverse Rendering. in Computer Graphics Forum