Generative models on manifolds

Lead Research Organisation: University of Oxford
Department Name: Statistics

Abstract

Brief description of the context of the research including potential impact

In this project we will study generative models under the manifold hypothesis. In particular we will focus on diffusion models, a class of generative models which has recently received much attention in the machine learning community due to its immense empirical success. Diffusion models now find applications in a wide range of fields including image and sound generation, medicine and protein design. Over the last three years there have been a number of advances towards understanding why and when these models perform well in practice; see for instance Sohl-Dickstein et al. 2015 and Song et al. 2020 for early work on such approaches, and De Bortoli et al. 2021 and Oko, Akiyama, and Suzuki 2023, among others. Apart from De Bortoli 2022 and, to some extent, Oko, Akiyama, and Suzuki 2023, little is known about the behaviour of the generative procedure when the distribution of the data is supported on a low-dimensional submanifold of the ambient space, except for the relatively simple and unrealistic scenario where the manifold is affine. We will first extend the work of Oko, Akiyama, and Suzuki 2023 to the case where the density has a given smoothness B on a general, unknown manifold of dimension d. We will begin by taking the dimension D of the ambient space to be fixed and study how diffusion generative models can adapt to the manifold and to the smoothness of the density. We will then investigate the more complicated high-dimensional framework, i.e. where D grows with the sample size n.

Aims and Objectives

We aim to understand the behaviour of denoising diffusion models trained on data living on a manifold. This is an important setting, since many important datasets are understood to satisfy the manifold hypothesis, e.g. imaging data.
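As a point of reference (a standard sketch assembled from the references above, with notation chosen here rather than taken from the proposal), the forward noising dynamics, its time reversal, and the denoising score matching objective used to learn the score can be written as

\[ dX_t = -X_t\,dt + \sqrt{2}\,dW_t, \qquad X_0 \sim p_{\mathrm{data}}, \]
\[ dY_t = \bigl( Y_t + 2\,\nabla \log p_{T-t}(Y_t) \bigr)\,dt + \sqrt{2}\,dB_t, \qquad Y_0 \sim p_T \approx \mathcal{N}(0, I_D), \]
\[ \mathcal{L}(\theta) = \mathbb{E}_{t,\, X_0,\, X_t}\Bigl[ \lambda(t)\, \bigl\| s_\theta(X_t, t) - \nabla_x \log p_{t \mid 0}(X_t \mid X_0) \bigr\|^2 \Bigr], \]

where p_t is the marginal law of X_t, p_{t|0} is the Gaussian transition kernel of the forward process, and the learned score s_\theta replaces \nabla \log p_t in the reverse dynamics. When p_data is supported on a d-dimensional submanifold of R^D, the score blows up as t tends to 0, which is precisely the regime that makes the analysis delicate.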

As a by-product, we will gain insight into the sample complexity of denoising diffusion models, an important theoretical question that is largely open. We also expect to gain some insight into how neural networks adapt to the geometry of the dataset; this is a difficult question, so even modest progress would be of great importance.
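As a purely illustrative toy experiment (our own sketch, not part of the project plan; the architecture, sample size and other hyperparameters below are arbitrary choices), one way to probe such questions empirically is to train a small score network by denoising score matching on data supported on a circle, a one-dimensional manifold embedded in R^D, and then vary the sample size n and the ambient dimension D:

# Toy sketch (not from the proposal): denoising score matching for data on a
# one-dimensional manifold (a circle) embedded in R^D, using PyTorch.
import torch
import torch.nn as nn

D = 16        # ambient dimension (can be varied to study scaling with D)
n = 4096      # sample size
T = 1.0       # time horizon of the forward Ornstein-Uhlenbeck process

# Data on a circle: the first two coordinates trace the unit circle, the
# remaining D - 2 coordinates are zero, so the intrinsic dimension is d = 1.
theta = 2 * torch.pi * torch.rand(n)
x0 = torch.zeros(n, D)
x0[:, 0], x0[:, 1] = torch.cos(theta), torch.sin(theta)

# Score network s_theta(x, t): a small MLP taking (x, t) and returning a vector in R^D.
score_net = nn.Sequential(
    nn.Linear(D + 1, 128), nn.SiLU(),
    nn.Linear(128, 128), nn.SiLU(),
    nn.Linear(128, D),
)
opt = torch.optim.Adam(score_net.parameters(), lr=1e-3)

for step in range(2000):
    x = x0[torch.randint(0, n, (256,))]
    t = torch.rand(256, 1) * T + 1e-3                 # avoid t = 0, where the score explodes
    # Forward OU marginal: X_t | X_0 ~ N(exp(-t) X_0, (1 - exp(-2t)) I_D).
    mean, var = torch.exp(-t) * x, 1.0 - torch.exp(-2.0 * t)
    eps = torch.randn_like(x)
    xt = mean + var.sqrt() * eps
    # Conditional score: grad_x log p_{t|0}(x_t | x_0) = -(x_t - mean) / var = -eps / sqrt(var).
    target = -eps / var.sqrt()
    pred = score_net(torch.cat([xt, t], dim=1))
    # Denoising score matching loss, weighted by the conditional variance.
    loss = (var * (pred - target) ** 2).sum(dim=1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

Sampling would then run the learned reverse dynamics from Gaussian noise; repeating the experiment over a grid of n and D gives a crude empirical handle on the sample complexity questions above.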
Novelty of the research methodology

Denoising diffusion models are a fairly recent type of generative model. Despite their impressive empirical success, our theoretical understanding of their properties is still very limited, especially in the important scenario where the data lives on a manifold. The problem we will be tackling is therefore open. The asymptotic approach we propose is also novel in the context of denoising diffusion models; the literature has focused mainly on non-asymptotic bounds, but for relatively simple scenarios such as affine manifolds. Considering asymptotic bounds is a relaxation which may allow us to treat much more general scenarios.
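Concretely, the kind of adaptive guarantee one might hope to establish (a heuristic target under the assumptions just described, stated in notation of our choosing and not a claimed result) is that, for a data density of smoothness B supported on an unknown d-dimensional submanifold of R^D, the distribution \hat p_n generated from n samples satisfies, up to logarithmic factors and in a suitable distance such as total variation,

\[ \mathrm{TV}\bigl( \hat p_n, p_{\mathrm{data}} \bigr) \lesssim n^{-B/(2B + d)}, \]

so that the exponent is governed by the intrinsic dimension d and the smoothness B rather than by the ambient dimension D. This would match the classical minimax rate for estimating a B-smooth density in d dimensions and is the sense in which the procedure would adapt to the unknown manifold.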

Alignment to EPSRC strategies and research areas

This project falls within the EPSRC research areas of Statistics and applied probability, and Artificial intelligence technologies.

Publications


Studentship Projects

Project Reference   Relationship   Related To      Start       End         Student Name
EP/T517811/1                                       30/09/2020  29/09/2025
2887804             Studentship    EP/T517811/1    30/09/2023  29/09/2027  Iskander Azangulov
EP/W524311/1                                       30/09/2022  29/09/2028
2887804             Studentship    EP/W524311/1    30/09/2023  29/09/2027  Iskander Azangulov