# Geometrical and Representation Learning in Deep Generative Models

Lead Research Organisation:
University of Oxford

Department Name: Statistics

### Abstract

Learning well-specified probabilistic models is at the heart of many problems in machine learning and statistics. Much focus has therefore been placed on developing methods for modelling and inferring expressive probability distributions. Deep generative models aim to combine the quantified uncertainty offered by probabilistic modelling with the flexibility and scalable learning of deep neural networks. These models have shown great promise for this task, as they have been successfully applied to a wide spectrum of datasets, ranging from computer vision and earth science to computational biology and molecular physics. This research area is one of the most rapidly evolving fields of statistical machine learning.

The most celebrated deep generative models include Generative Adversarial Networks (GANs) (Goodfellow et al., 2014), Variational Auto-Encoders (VAEs) (Kingma and Welling, 2014; Rezende et al., 2014) and Normalizing Flows (NFs) (Rezende and Mohamed, 2016).

Some of these models - e.g. VAEs or GANs - are latent variable models, i.e. they relate observed data points to hidden latent variables through some stochastic process. These variables seek to explain, or represent, observations in a compressed and more interpretable manner. The field of representation learning (Bengio et al., 2014) aims at learning a representation that captures the probability distribution of the underlying explanatory features for the observed input. These representations can be useful for their own sake, being human-interpretable, or for automated downstream tasks such as classification. An oft-stated motivation for learning disentangled representations of data with deep generative models is the desire for interpretable latent representations (Chen et al., 2017) that admit intuitive explanations. Most work has focused on capturing purely independent factors of variation. Consequently, in Chapter 2 we develop a generalisation of disentanglement in VAEs that allows for richer structures than independence, such as sparsity and clustering.

Tangentially, it can be argued that in many domains data should be represented hierarchically. For example, in cognitive science, it is widely accepted that human beings use a hierarchy to organise object categories (e.g. Roy et al., 2006; Collins and Quillian, 1969; Keil, 1979). In biology, the theory of evolution (Darwin, 1859) implies that features of living organisms are related in a hierarchical manner given by the evolutionary tree. Explicitly incorporating hierarchical structure in probabilistic models has unsurprisingly been a long-running research topic (e.g. Duda et al., 2000; Heller and Ghahramani, 2005). However, traditional VAEs map data into a Euclidean latent space, which cannot efficiently embed tree-like structures, whereas hyperbolic spaces, which have negative curvature, can. Therefore, in Chapter 3 we endow VAEs with a Poincaré ball model of hyperbolic geometry as a latent space and rigorously derive the necessary methods to work with two main Gaussian generalisations on that space.
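To make the claim about tree-like structure concrete, the following is a minimal sketch of the standard geodesic distance on the Poincaré ball, assuming the unit ball with curvature -1. The function name `poincare_distance` and the sample points are illustrative assumptions, not taken from the thesis. It compares the direct distance between two points with the length of the path that passes through the origin.

```python
import math


def poincare_distance(x, y):
    """Geodesic distance between two points of the unit Poincaré ball.

    Assumes curvature -1; x and y are sequences of floats with norm < 1.
    """
    sq_norm_x = sum(a * a for a in x)
    sq_norm_y = sum(b * b for b in y)
    if sq_norm_x >= 1.0 or sq_norm_y >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    # d(x, y) = arcosh(1 + 2 |x - y|^2 / ((1 - |x|^2) (1 - |y|^2)))
    return math.acosh(1.0 + 2.0 * sq_dist / ((1.0 - sq_norm_x) * (1.0 - sq_norm_y)))


def detour_ratio(r):
    """Direct distance divided by the distance of the path through the origin,
    for two points at Euclidean radius r on perpendicular axes."""
    origin = (0.0, 0.0)
    a, b = (r, 0.0), (0.0, r)
    direct = poincare_distance(a, b)
    via_origin = poincare_distance(a, origin) + poincare_distance(origin, b)
    return direct / via_origin


if __name__ == "__main__":
    for r in (0.5, 0.9, 0.99):
        print(f"r={r}: direct / via-origin = {detour_ratio(r):.3f}")
```

As the points move towards the boundary, the ratio approaches 1: the shortest path between them is almost as long as the detour through the origin, much as the path between two leaves of a tree passes through their common ancestor. In a Euclidean plane the same ratio stays fixed regardless of radius.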

More generally, an important aspect of well-specified models is to correctly characterize the geometry which describes the proximity of data points. Riemannian manifolds provide a general framework for this purpose and are a natural approach to model tasks in many scientific fields ranging from earth and climate science to biology and computer vision (e.g. Karpatne et al., 2017; Hamelryck et al., 2006; Klimovskaia et al., 2019; Lui, 2012). If appropriately chosen, manifold-informed methods can lead to improved sample complexity and generalization, improved fit in the low parameter regime, and guide inference methods to interpretable models. They can also be understood as a geometric prior that encodes a practitioner's assumption about the data and imposes an inductive bias.

People | ORCID iD |
---|---|
Yee Teh (Primary Supervisor) | |
Emile Mathieu (Student) | |

### Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name |
---|---|---|---|---|---|
EP/P510609/1 | | | 30/09/2016 | 29/09/2021 | |
2378872 | Studentship | EP/P510609/1 | 30/09/2017 | 14/07/2021 | Emile Mathieu |