Random trees: analysis and applications

Lead Research Organisation: University of Bath
Department Name: Mathematical Sciences

Abstract

Trees are a mathematical object that arise naturally in various contexts: for example, in computer science, search trees are widely use in databases as an efficient way to store and retrieve data, in biology, phylogenetic trees are key in understanding the evolution of species, and in network theory, trees are meaningful, tractable models for large networks.

This applied probability project focuses on several models of random trees that arise from applications to computer science and physics. The project is divided into three distinct but inter-related parts, each of them is application-driven and involves proving fundamental mathematical results about random trees.

Part A of the project focuses on a new model of random inputs for the satisfiability problem: random Boolean trees. The importance of this problem is two-fold: fundamentally, since it is closely related to the P vs. NP problem (one of the seven Millenium problems of the Clay Mathematics Institute), and practically, as it is related to finding efficient algorithms for very common industrial problems such as hardware and software verifications.

In Part B, results about the shape (profile) of the so-called "weighted random recursive tree", are used as a first-step towards understanding more intricate models for animal and human mobility. Understanding how humans move, for example in a city, is crucial when aiming at understanding more complex phenomena such as the transmission of epidemics. The model I will study in this project was introduced by physicists and fits data collected in a study of capuchin monkeys. Proving mathematical results about this random walk will thus enhance our fundamental knowledge base around animal and human mobility.

In Part C, random trees are a model for large networks such as the internet, or social networks. Because of their enormous size (in terms of numbers of routers for the internet, or number of users in social networks), and because of the fact that they constantly evolve with time, it is impossible to keep an up-to-date map of these networks, and there is thus a need for good mathematical models. Although trees contains no "loops" (or cycles), whereas networks typically do, random trees are meaningful models for networks, and the fact that they have no cycles make them more tractable than random graphs. Part C focuses on such a model, introduced by physicists: the so-called "preferential attachment tree with fitnesses" and aims at proving results about its larger hub (node with the most links) and about its shape (or "profile").

Planned Impact

This fellowship aims at proving fundamental results about problems arising from computer science, physics and biology; the outputs of the fellowship will be mathematical theorems and their proofs increasing our understanding of these applications.

The potential economic and societal impacts, for each of the parts [A-C] of the fellowship, are:

[A] Sat-solvers, i.e. algorithms that solve the satisfiability problem, are widely used in industry, for example in the context of software or hardware verification. The theorems, proofs, and computational experiments of the fellowship will contribute to a better understanding of the satisfiability problem and this insight could eventually give new ideas towards designing more efficient sat-solvers.

[B] Understanding human mobility is the key towards understanding the spread of epidemics or cultural traits in a population of moving individuals. Part [B] of this proposal will deliver theorems about a model of reinforced random walks that was introduced by physicists as a model for animal and human roaming behaviour. The results of the fellowship and their proofs will thus extend the fundamental knowledge base around human mobility.

[C] Part C of the fellowship focuses on understanding the structure of large networks. Since networks are ubiquitous in, for example, telecommunications, social sciences, economics, and biology, the results of the fellowship have the potentiel for diverse impacts. Many of the very large networks arising from these applications have the same structure; therefore, understanding this structure in detail has countless applications: e.g. community detection in social network, virus spreading in a population seen as a network of individuals in contact, malware attacks on telecommunication networks.

Publications

10 25 50
publication icon
Björnberg J (2020) Characterising random partitions by random colouring in Electronic Communications in Probability

publication icon
Boci E (2023) Large deviation principle for a stochastic process with random reinforced relocations in Journal of Statistical Mechanics: Theory and Experiment

publication icon
Fountoulakis N (2022) Dynamical models for random simplicial complexes in The Annals of Applied Probability

publication icon
Janson S (2023) Fluctuations of balanced urns with infinitely many colours in Electronic Journal of Probability

publication icon
Kious D (2022) Finding geodesics on graphs using reinforcement learning in The Annals of Applied Probability

publication icon
Kious D (2022) The trace-reinforced ants process does not find shortest paths in Journal de l'École polytechnique - Mathématiques

publication icon
Mailler C (2021) Competing growth processes with random growth rates and random birth times in Stochastic Processes and their Applications

publication icon
Mailler C (2022) Parameterised branching processes: A functional version of Kesten & Stigum theorem in Stochastic Processes and their Applications

 
Description 1- My collaborator Gerónimo Uribe Bravo and I have completed the first rigorous study of random walks with random reinforced relocations. This model was originally introduced in the physics literature as a model for animal foraging behaviour. Our contribution is to extend the model far beyond its orignal setting and introduce a probabilistic method to prove the results conjectured by the physicists. This first study open the way for further developments: we showed general results for the typical position of the walker at large times by proving a central limit theorem, but other questions such as the behaviour of large deviations remain open.

2- Another output of this project is a collaboration with math biologists from the university of Bath, Dr Kit Yates and his PhD student Cameron Smith, on domain-growth, a key process in, e.g., embryonic development, the growth of tissue, and limb regeneration. We explain why the classical model is biased at the boundary and how to fix this problem.

3- A more fundamental output of the project, in collaboration with Jakob Björnberg (Gothenburg), Peter Mörters (Cologne) and Daniel Ueltschi (Warwick) is a paper on ``color-and-divide'' processes motivated by applications to random loops models. These processes are important models in statistical physics, in particular because of their close links to the Bose-Einstein condensation, for which a rigorous explanation is still missing.

4- A core-output of the project is an article with Denis Villemonais is now published in the Annals of Applied Probability. We show how to apply stochastic approximation methods to the analysis of the recent model of infinitely-many-colour Pólya urns. This paper unveils important links between this model and the theory of quasi-stationary distribution and its application to Monte-Carlo simulation methods.

5- Two important output on Part C of my proposal are (A) a result on the degree distribution in a model of random simplicial complexes that was introduced in the physics literature and (B) a result on the maximal degree in a model of reinforced branching processes that generalises the famous Bianconi and Barabasi model for complex networks.
The latter is now published at Random Structures and Algorithms, the second is accepted for publication at the Annals of Applied Probability.

6- With Daniel Kious (Bath) and Bruno Schapira (Marseille), we introduced a new model for ants finding shortest paths between their nest and the food using reinforcement-learning. It is well-known that ants find shortest paths using no other means of communication than the pheromones they deposit behind them: our model is the first probabilistic model for this phenomenon. In two outputs, one published at the Journal de l'Ecole Polytechnique and one accepted for publication at the Annals of Applied Probability, we show how little variations in the definition of the model can change the output drastically, showing that this model has the mathematical richness to justify further study.
Exploitation Route Outcomes 1- and 4- open the way to other studies, and in fact have already been cited in the probability and physics literature. It is in particular important to explore the applications of 4- to the rigorous analysis of Monte-Carlo simulation methods. Finally, I believe that 6- is the start of a whole new strand of research on the application of discrete probability theory to reinforcement learning.
Sectors Other