Phenomenology of Deep Learning
Lead Research Organisation:
University of Oxford
Department Name: Oxford Physics
Abstract
Recent Machine Learning (ML) breakthroughs in industry and the sciences rely on neural network architectures with multiple layers, a
branch of ML called Deep Learning (DL). Over the last decade, much of this progress was possible thanks to the impressive growth of
the size of datasets and neural networks. In contrast, the understanding of the foundations of DL has not followed this successful
trend. In fact, there is an enormous gap between the practical success of DL and our understanding of why DL works so well. It is
widely acknowledged that to expand the scope of ML applications and obtain reliable artificial intelligence systems, we must achieve
a fundamental understanding of DL.
A physics-based framework can provide a unique perspective on pressing questions in DL and contribute to filling the gap between
practical developments and theoretical foundations. We stress that, while ML applications are broadly used in physics, the flow of
ideas in the opposite direction, i.e., the use of concepts and techniques from theoretical physics to understand modern deep learning,
has only started to be explored.
In this MSCA fellowship, we will capitalize on the striking similarities between deep neural networks and effective field theory in
physics. The goal of "Phenomenology of Deep Learning" (PHENO-DL) is to contribute to the development of an effective theory of
deep learning. We will make use of physics model-building tools to attack foundational questions (e.g., how do deep neural networks
generalize?) and remarkable phenomena in deep learning (namely, double descent and adversarial examples). To this end, we will
adopt a setup inspired by established methods from theoretical physics, which have been recently applied to neural networks. In
particular, to explain a neural network's expressivity and capacity, we will use the interplay between the renormalisation group in
physics and the 'hierarchy of features' across the layers of a deep neural network.
Organisations
People
| Name | Role |
| Adriaan Louis | Principal Investigator |
| Nayara Fonseca De Sa | Fellow |
Publications
Nam Y., Fonseca N. (2024) "An exactly solvable model for emergence and scaling laws in the multitask sparse parity problem"
| Description | A big mystery in modern deep learning systems, such as large language models, is how their behaviour changes with increasing amounts of data or compute time. New skills appear to emerge suddenly, with the model shifting abruptly from lacking a skill to fully mastering it. In addition, measures of model performance appear to follow well-defined scaling laws as data size, training time, or model size increases. These scaling laws are important in practice because they tell us how much compute or data must be invested to reach a target level of model performance. We derived a simple, analytically solvable model that exhibits these phenomena, allowing us to explain both the apparent emergence of skills and the observed scaling behaviour. |
| Exploitation Route | Because of the great commercial and academic interest in scaling behaviour in large language models, we believe that these results may be taken forward to guide the training and design of larger AI models. |
| Sectors | Digital/Communication/Information Technologies (including Software) |
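As a minimal illustration of the scaling-law behaviour described above (a sketch only, not code from the publication), a common ansatz is that loss falls as a power law in dataset size, L(N) = a·N^(-alpha) + b; the constants and variable names below are invented for demonstration:

```python
import numpy as np

# Hypothetical power-law scaling ansatz L(N) = a * N**(-alpha) + b.
# The constants here are made up purely for illustration.
a, alpha, b = 2.0, 0.5, 0.1
N = np.logspace(2, 6, 20)        # dataset sizes from 1e2 to 1e6
loss = a * N**(-alpha) + b       # noiseless synthetic losses

# Recover the exponent with a log-log linear fit on the reducible loss:
# log(L - b) = log(a) - alpha * log(N), so the slope gives -alpha.
slope, intercept = np.polyfit(np.log(N), np.log(loss - b), 1)
print(f"fitted exponent: {-slope:.2f}")  # prints 0.50
```

In practice the irreducible constant b and the exponent alpha are estimated jointly from noisy measurements; the noiseless fit above just shows how the exponent is read off a log-log plot.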
| Description | Meeting at The Royal Society (London) |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Postgraduate students |
| Results and Impact | I attended and presented a poster at the 'Beyond the symbols vs signals debate' meeting at The Royal Society (London). |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://royalsociety.org/science-events-and-lectures/2024/10/symbols-vs-signals/ |
| Description | Workshop at the Institute of Physics (London) |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | I participated in the Physics for AI and AI for Physics: Landscaping Workshop at the Institute of Physics, London. |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://iop.eventsair.com/physics-for-ai-and-ai-for-physics-landscaping-workshop |