Open-Ended Learning for Foundation Models
Lead Research Organisation:
University of Oxford
Abstract
In recent years, large neural networks trained on vast datasets have become increasingly prevalent across machine learning. These models underpin advances in disciplines such as language modeling, computer vision, and reinforcement learning, as well as in applied fields including climate forecasting and autonomous vehicles. During training, such networks typically master the majority of tasks represented in the dataset. Certain tasks, however, remain unlearned even after extensive training, largely because uniform random sampling from the dataset rarely exposes the model to them often enough. As models advance, more complex tasks come within reach, yet gathering the necessary data and training on these harder problems becomes significantly more resource-intensive.
This research focuses on designing targeted curricula to identify and address the gaps in model learning, ensuring continuous performance improvement. In parallel, it explores training and optimization strategies aimed at preserving the model's learning flexibility, preventing the phenomenon of forgetting, and maintaining an open-ended learning trajectory. The ultimate objective is to develop a suite of techniques for enhancing pretrained models, making them more robust and capable of handling difficult, previously unlearnable tasks.
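To make the idea of a targeted curriculum concrete, the sketch below samples training tasks in proportion to the model's current loss on them, so that under-learned tasks are revisited more often. This is a minimal illustration only, assuming per-task loss estimates are available; names such as `curriculum_sampling_probs` and `temperature` are hypothetical and not part of the project plan.

```python
import numpy as np

def curriculum_sampling_probs(task_losses, temperature=1.0):
    """Turn per-task losses into sampling probabilities.

    Tasks the model currently performs poorly on (high loss) are sampled
    more often, focusing training on gaps in capability. `temperature`
    controls how sharply the curriculum concentrates on the hardest tasks.
    """
    losses = np.asarray(task_losses, dtype=np.float64)
    scores = losses / max(temperature, 1e-8)
    scores = scores - scores.max()      # subtract max for numerical stability
    probs = np.exp(scores)
    return probs / probs.sum()

# Hypothetical usage: three tasks, the third is barely learned,
# so it receives most of the sampling probability.
probs = curriculum_sampling_probs([0.1, 0.4, 2.3], temperature=0.5)
task_id = np.random.choice(len(probs), p=probs)  # task to train on next
```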
Aims and Objectives:
o Identify and expose models to tasks they struggle with, improving their ability to learn.
o Develop methods that prevent models from forgetting previously learned tasks while training on new ones (a minimal rehearsal-style sketch follows this list).
o Create self-improving models that can avoid performance plateaus, potentially through techniques like online reinforcement learning (RL).
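One simple way to pursue the second objective is rehearsal: mixing a fraction of previously learned tasks back into each new training batch. The sketch below is a minimal illustration of that idea, not the project's method; the function `build_rehearsal_batch` and parameters such as `mix_ratio` are assumed names for the purpose of the example.

```python
import random

def build_rehearsal_batch(new_examples, replay_buffer, batch_size, mix_ratio=0.25):
    """Compose a training batch that mixes new-task data with replayed
    examples from earlier tasks, a simple guard against forgetting.

    `mix_ratio` is the fraction of the batch drawn from the replay buffer.
    """
    n_replay = min(int(batch_size * mix_ratio), len(replay_buffer))
    n_new = batch_size - n_replay
    batch = random.sample(new_examples, min(n_new, len(new_examples)))
    batch += random.sample(replay_buffer, n_replay)
    random.shuffle(batch)
    return batch

# Hypothetical usage: keep roughly 25% of each batch as rehearsal of old tasks.
old_tasks = [("old", i) for i in range(1000)]   # stand-in for stored examples
new_tasks = [("new", i) for i in range(1000)]
batch = build_rehearsal_batch(new_tasks, old_tasks, batch_size=32)
```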
Novelty of the Research Methodology:
Current approaches to improving model performance tend to focus on scaling up both model parameters and dataset size. However, simply increasing the difficulty or volume of data becomes impractical for particularly challenging problems. This research instead takes a more targeted approach, refining how models are trained and how datasets are curated, especially for tasks that are complex or costly to solve.
Alignment to EPSRC's Strategies and Research Areas:
The research aligns with the EPSRC's goals of fostering more secure and reliable systems. By developing models capable of solving harder applied problems, this work could significantly impact fields like chemistry, biology, and medicine, where robust models might help prevent infections or discover new treatments.
Collaborations:
The project is likely to involve collaboration with DeepMind, given its close alignment with the lab, as well as with other researchers within the lab group.
Organisations
People
| Name | ORCID iD |
|---|---|
| Thomas Foster (Student) | |
Studentship Projects
| Project Reference | Relationship | Related To | Start | End | Student Name |
|---|---|---|---|---|---|
| EP/S024050/1 | | | 30/09/2019 | 30/03/2028 | |
| 2868356 | Studentship | EP/S024050/1 | 30/09/2023 | 29/09/2027 | Thomas Foster |