Deep learning in parameter space
Lead Research Organisation: University of Oxford
Abstract
Brief description of the context of the research including potential impact: With the rapid growth of publicly available deep learning models (over a million open-source models on Hugging Face alone), it is increasingly feasible to build "meta-models" that operate directly on model parameters. Such meta-models can support tasks like editing representations (e.g. NeRFs for images and 3D scenes), compressing model weights, predicting model accuracy, distilling large models, domain adaptation, automatic fine-tuning, and more. The primary challenge in training such meta-models lies in the high dimensionality and representational redundancy of neural network parameter spaces.
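To make "operating directly on model parameters" concrete, here is a minimal sketch (illustrative only, not the project's method) of the simplest possible meta-model: an MLP that ingests the flattened weight vector of a small base network and predicts a scalar property such as held-out accuracy. It assumes PyTorch; every architecture and dimension below is an arbitrary choice for illustration.

```python
import torch
import torch.nn as nn

# A small "base" network whose trained parameters are the meta-model's input.
base = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))

# Flatten every parameter tensor of the base network into one vector.
flat = torch.cat([p.detach().flatten() for p in base.parameters()])

# A naive meta-model: an MLP mapping the flattened weights to a scalar
# in [0, 1] (e.g. predicted test accuracy). Treating the weights as an
# unstructured vector ignores the parameter-space symmetries discussed
# below, which is exactly the inefficiency this project targets.
meta = nn.Sequential(
    nn.Linear(flat.numel(), 256),
    nn.ReLU(),
    nn.Linear(256, 1),
    nn.Sigmoid(),
)

print(meta(flat).item())  # predicted accuracy for this weight vector
```

A flat meta-model like this must relearn the same behaviour for every symmetric rearrangement of the input weights; equivariant architectures avoid that redundancy by construction.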
Aims and Objectives: Design efficient, equivariant meta-models that can learn effectively on the high-dimensional parameter spaces of neural networks. These models will leverage parameter symmetries to reduce the complexity of processing large weight spaces.
Novelty of the research methodology: While neural network parameter spaces can be extremely high-dimensional, they are intrinsically symmetric: there exist large groups of transformations that leave the underlying model unchanged. This project utilizes these symmetries to construct scalable, equivariant meta-model architectures, enabling more efficient learning and optimization in these high-dimensional settings.
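As a concrete instance of such a symmetry (a standard example, not specific to this proposal): permuting the hidden units of a two-layer ReLU MLP, together with the matching rows and columns of its weight matrices, leaves the network's input-output function unchanged. A minimal NumPy check, with arbitrary shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)  # input -> hidden
W2, b2 = rng.normal(size=(3, 8)), rng.normal(size=3)  # hidden -> output

def mlp(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2  # two-layer ReLU network

# A random permutation of the 8 hidden units, as a permutation matrix.
P = np.eye(8)[rng.permutation(8)]

x = rng.normal(size=4)
y = mlp(x, W1, b1, W2, b2)
# Permute the hidden layer: rows of W1 and b1, columns of W2.
y_perm = mlp(x, P @ W1, P @ b1, W2 @ P.T, b2)

assert np.allclose(y, y_perm)  # identical outputs: the model is unchanged
```

With n hidden units there are n! such permutations per layer, so the set of functionally identical weight settings is enormous; an equivariant meta-model treats all of them consistently by design rather than having to learn that invariance from data.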
Alignment to EPSRC's strategies and research areas: This project aligns with the EPSRC's
"Artificial Intelligence Technologies" and "Algebra" themes, employing algebraic tools (such as
group theory and representation theory) to exploit symmetries in deep learning parameter
spaces, enhancing AI model performance.
People
| Person | ORCID iD |
|---|---|
| Yoav Gelberg (Student) | |
Studentship Projects
| Project Reference | Relationship | Related To | Start | End | Student Name |
|---|---|---|---|---|---|
| EP/S024050/1 | | | 30/09/2019 | 30/03/2028 | |
| 2868388 | Studentship | EP/S024050/1 | 30/09/2023 | 29/09/2027 | Yoav Gelberg |