Phenomenology of Deep Learning

Lead Research Organisation: University of Oxford
Department Name: Oxford Physics

Abstract

Recent Machine Learning (ML) breakthroughs in industry and the sciences rely on neural network architectures with multiple layers, a branch of ML called Deep Learning (DL). Over the last decade, much of this progress was made possible by the impressive growth in the size of datasets and neural networks. The understanding of the foundations of DL, in contrast, has not kept pace. In fact, there is an enormous gap between the practical success of DL and our understanding of why DL works so well. It is widely acknowledged that to expand the scope of ML applications and obtain reliable artificial intelligence systems, we must achieve a fundamental understanding of DL.

A physics-based framework can provide a unique perspective on pressing questions in DL and help to fill the gap between practical developments and theoretical foundations. We stress that, while ML applications are broadly used in physics, the flow of ideas in the opposite direction, i.e., the use of concepts and techniques from theoretical physics to understand modern deep learning, has only started to be explored.

In this MSCA project, we will exploit the striking similarities between deep neural networks and effective field theory in physics. The goal of "Phenomenology of Deep Learning" (PHENO-DL) is to contribute to the development of an effective theory of deep learning. We will use physics model-building tools to attack foundational questions (e.g., how do deep neural networks generalize?) and remarkable phenomena in deep learning (namely, double descent and adversarial examples). To this end, we will adopt a setup inspired by established methods from theoretical physics, which have recently been applied to neural networks. In particular, to explain the neural network's expressivity and capacity, we will use the interplay between the renormalisation group in physics and the 'hierarchy of features' in the different layers of a deep neural network.
 
Description: A major open question about modern deep learning systems such as large language models is how their behaviour changes with increasing amounts of data or compute time. New skills appear to emerge suddenly, with the model going from lacking a skill to fully mastering it. In addition, measures of model performance appear to follow well-defined scaling laws as data size, training time, or model size increases. These scaling laws matter in practice because they indicate how much investment is needed to obtain a given improvement in model behaviour.

We derived a simple, analytically solvable model that exhibits these phenomena. It therefore lets us explain why emergence appears to occur and why these models show scaling behaviour.
Exploitation Route: Because of the great commercial and academic interest in scaling behaviour in large language models, we believe that these results may be taken forward to guide the training and design of larger AI models.
Sectors: Digital/Communication/Information Technologies (including Software)
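
As a rough illustration of what a scaling law of the kind described above looks like in practice, the following Python sketch fits a power law, loss ≈ a · size^(-alpha), to synthetic data and inverts it to estimate the size needed for a target loss. This is not the project's analytically solvable model, which is not reproduced in this record. The functional form, the function names, and all numerical values (`true_a`, `true_alpha`, the noise level) are assumptions chosen purely for illustration.

```python
import numpy as np


def fit_power_law(sizes, losses):
    """Fit loss ≈ a * size**(-alpha) by least squares in log-log space."""
    slope, intercept = np.polyfit(np.log(sizes), np.log(losses), deg=1)
    return float(np.exp(intercept)), float(-slope)


def size_needed(target_loss, a, alpha):
    """Invert the fitted law: the size at which the predicted loss equals target_loss."""
    return (a / target_loss) ** (1.0 / alpha)


# Synthetic "measurements" (assumed values, not results from the project).
rng = np.random.default_rng(0)
sizes = np.logspace(3, 8, num=12)          # e.g. dataset or model sizes
true_a, true_alpha = 5.0, 0.3              # hypothetical ground-truth parameters
noise = np.exp(rng.normal(0.0, 0.02, sizes.size))
losses = true_a * sizes ** (-true_alpha) * noise

a_hat, alpha_hat = fit_power_law(sizes, losses)
print(f"fitted a = {a_hat:.3f}, fitted alpha = {alpha_hat:.3f}")
print(f"size for loss 0.01 under the fitted law: {size_needed(0.01, a_hat, alpha_hat):.3e}")
```

On a log-log plot such data fall on a straight line whose slope is -alpha, which is why the fit is done in log space. Inverting the fitted law is what ties a desired improvement to the scale, and hence the cost, required to reach it.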

 
Description: Meeting at The Royal Society (London)
Form Of Engagement Activity: Participation in an activity, workshop or similar
Part Of Official Scheme?: No
Geographic Reach: International
Primary Audience: Postgraduate students
Results and Impact: I attended and presented a poster at the 'Beyond the symbols vs signals debate' meeting at The Royal Society (London).
Year(s) Of Engagement Activity: 2024
URL: https://royalsociety.org/science-events-and-lectures/2024/10/symbols-vs-signals/
 
Description: Workshop at the Institute of Physics (London)
Form Of Engagement Activity: A formal working group, expert panel or dialogue
Part Of Official Scheme?: No
Geographic Reach: National
Primary Audience: Professional Practitioners
Results and Impact: I participated in the Physics for AI and AI for Physics: Landscaping Workshop at the Institute of Physics, London.
Year(s) Of Engagement Activity: 2024
URL: https://iop.eventsair.com/physics-for-ai-and-ai-for-physics-landscaping-workshop