Efficient Spectral Algorithms for Massive and Dynamic Graphs

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Informatics

Abstract

Spectral graph theory investigates the algebraic properties of matrices associated with graphs. Over the past 20 years, studies in spectral graph theory have successfully overcome fundamental bottlenecks faced by combinatorial algorithms, and have become a major research focus in theoretical computer science, machine learning, and network analysis. In particular, some recent breakthrough results show that spectral techniques can be applied to solve central optimisation and learning problems in nearly-linear time. Designing such highly efficient algorithms is crucial to cope with the emergence of massive graphs and real-time data sets coming from technological, social and biological networks.

In this fellowship I propose three research directions to advance the studies of spectral algorithms and their applications in data science: (1) I propose to advance our understanding of fundamental spectral techniques by studying the spectral properties of different matrices representing directed graphs, and exploring new connections between graphs and other mathematical objects, e.g., manifolds studied in geometry; (2) I propose to investigate new spectral algorithms for two fundamental graph problems in different settings, and further improve the state-of-the-art of the two problems with respect to the algorithms' runtime and performance; (3) spectral algorithms vastly outperform combinatorial algorithms with respect to their runtime in the worst case, however the design of most spectral algorithms usually involve many procedures, which bring the issue of numerical stability for efficient implementations. To address this I propose to develop an open-source algorithmic library for spectral algorithms so that by the end of the fellowship data scientists would be able to use the state-of-the-art algorithms for spectral sparsification and graph clustering in a black box manner.

A successful completion of the fellowship will make a significant step towards understanding the power and limits of the algebraic techniques in designing fast graph algorithms in various settings, and the performance of nearly-linear time spectral algorithms in real world data sets.

Planned Impact

The overall fellowship is to investigate foundational, algorithmic, and applied aspects of spectral methods. By the nature of the proposed work and the tight relationships among the four work packages in the proposal, the fellowship will build connections between a number of active research areas in computer science and mathematics, and the fellowship's outcome would ideally spread to all these fields.

Because of the cross-disciplinary nature of the fellowship, academic researchers in the disciplines directly involved, including algorithms, combinatorial optimisation, machine learning, and numerical linear algebra, will appreciate the technical advances of the fellowship that addresses not only the theoretical foundations but also practical applications of the spectral algorithms for the two fundamental graph problems. The fellowship will further investigate the algorithmic Kadison-Singer problem, and the relationships between graphs and manifolds, whose research outcome will have a significant impact in geometry.

During the fellowship there will be a series of pathways to impact events, including two workshops in algorithmic spectral graph theory, master course on Spectral Algorithms and Their Applications in Data Science to be held at the Alan Turing Institute, as well as the involvement of the Turing Challenge Fortnights. These events provide an ideal platform to not only report the fellowship's progress with the other experts in the area and exchange ideas with researchers in related fields, but also introduce this frontier research field to other early-career researchers and more postgraduate students in the UK.

Finally, the fellowship will produce the first open-source algorithmic library of the state-of-the-art algorithms for spectral sparsification and graph clustering. The algorithmic library's performance with respect to its runtime and numerical stability on real-world data sets will be of great value to researchers in algorithms, machine learning, and numerical linear algebra. In the long term, a large number of end users of spectral algorithms will be also greatly benefited from the algorithmic library.
 
Description - We have studied directed graph clustering and its applications for learning higher-order structures of clusters. In particular, we demonstrate that these algorithms can be used to uncover many significant patterns of clusters, and the use of our designed algorithms in analysing international trade and migration datasets. We further show that these significant patterns can be uncovered with local algorithms; the runtime of local algorithms is usually independent of the size of the underlying graph, and it's very desired for analysing massive datasets.

- Spectral clustering is one of most popular algorithms for modern graph clustering tasks, and consist of the two steps: embed the vertices of some graph into k-dimensional Euclidean space using k eigenvectors of some matrix of G, and applies k-means to partition the vertex set of an input graph into k clusters. We prove that, by applying fewer than k eigenvectors to construct the embedding, spectral clustering is able to produce better output for many practical instances; this result is the first of its kind in spectral clustering, and demonstrates that one can improve the quality of clustering outcome through some simple enumeration of eigenvectors.
Exploitation Route - The others interested in analysing higher-order information of graphs could use our designed algorithms directly.

- Our newly developed algorithm for hierarchical clustering can be used by others to better analyse the hierarchical structure of the input set occurring in practice.

- Our newly developed algorithm for finding dense components in hypergraphs could potentially be used in other research areas of computer science, including network analysis, and natural language processing.
Sectors Digital/Communication/Information Technologies (including Software)

 
Description Some of our recent works appearing at NeurIPS'20 and AISTATS'20 shows that our designed algorithms can be used to analyse international trade and various migration datasets. In addition, our work appearing at ICML'21 shows that some structures between the clusters can be uncovered with local algorithms, and the performance of our designed algorithms is showcased through the US migration dataset and the Interstate Disputes Dataset. We expect that this line of research would receive more recognition outside computer science in the near future.
First Year Of Impact 2021
Sector Other
Impact Types Economic,Policy & public services

 
Description Dynamic Graph Sparsification 
Organisation ETH Zurich
Country Switzerland 
Sector Academic/University 
PI Contribution I initiate the project during a research visit, and contribute some of the key ideas for the collaboration.
Collaborator Contribution Designing efficient dynamic graph algorithms against an adaptive adversary is an important goal in the field of dynamic graph algorithms and has witnessed many exciting recent developments. Our work presents the first non-trivial efficient adaptive algorithms for maintaining many sparsifiers against an adaptive adversary.
Impact Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary. A. Bernstein, J. van den Brand, M. Gutenberg, D. Nanongkai, T. Saranurak, A. Sidford, and H. Sun. Proceedings of the 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
Start Year 2020
 
Description Dynamic Graph Sparsification 
Organisation Rutgers University
Country United States 
Sector Academic/University 
PI Contribution I initiate the project during a research visit, and contribute some of the key ideas for the collaboration.
Collaborator Contribution Designing efficient dynamic graph algorithms against an adaptive adversary is an important goal in the field of dynamic graph algorithms and has witnessed many exciting recent developments. Our work presents the first non-trivial efficient adaptive algorithms for maintaining many sparsifiers against an adaptive adversary.
Impact Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary. A. Bernstein, J. van den Brand, M. Gutenberg, D. Nanongkai, T. Saranurak, A. Sidford, and H. Sun. Proceedings of the 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
Start Year 2020
 
Description Dynamic Graph Sparsification 
Organisation Stanford University
Country United States 
Sector Academic/University 
PI Contribution I initiate the project during a research visit, and contribute some of the key ideas for the collaboration.
Collaborator Contribution Designing efficient dynamic graph algorithms against an adaptive adversary is an important goal in the field of dynamic graph algorithms and has witnessed many exciting recent developments. Our work presents the first non-trivial efficient adaptive algorithms for maintaining many sparsifiers against an adaptive adversary.
Impact Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary. A. Bernstein, J. van den Brand, M. Gutenberg, D. Nanongkai, T. Saranurak, A. Sidford, and H. Sun. Proceedings of the 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
Start Year 2020
 
Description Dynamic Graph Sparsification 
Organisation University of California, Berkeley
Country United States 
Sector Academic/University 
PI Contribution I initiate the project during a research visit, and contribute some of the key ideas for the collaboration.
Collaborator Contribution Designing efficient dynamic graph algorithms against an adaptive adversary is an important goal in the field of dynamic graph algorithms and has witnessed many exciting recent developments. Our work presents the first non-trivial efficient adaptive algorithms for maintaining many sparsifiers against an adaptive adversary.
Impact Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary. A. Bernstein, J. van den Brand, M. Gutenberg, D. Nanongkai, T. Saranurak, A. Sidford, and H. Sun. Proceedings of the 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
Start Year 2020
 
Description Dynamic Graph Sparsification 
Organisation University of Copenhagen
Country Denmark 
Sector Academic/University 
PI Contribution I initiate the project during a research visit, and contribute some of the key ideas for the collaboration.
Collaborator Contribution Designing efficient dynamic graph algorithms against an adaptive adversary is an important goal in the field of dynamic graph algorithms and has witnessed many exciting recent developments. Our work presents the first non-trivial efficient adaptive algorithms for maintaining many sparsifiers against an adaptive adversary.
Impact Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary. A. Bernstein, J. van den Brand, M. Gutenberg, D. Nanongkai, T. Saranurak, A. Sidford, and H. Sun. Proceedings of the 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
Start Year 2020
 
Description Dynamic Graph Sparsification 
Organisation University of Michigan
Country United States 
Sector Academic/University 
PI Contribution I initiate the project during a research visit, and contribute some of the key ideas for the collaboration.
Collaborator Contribution Designing efficient dynamic graph algorithms against an adaptive adversary is an important goal in the field of dynamic graph algorithms and has witnessed many exciting recent developments. Our work presents the first non-trivial efficient adaptive algorithms for maintaining many sparsifiers against an adaptive adversary.
Impact Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary. A. Bernstein, J. van den Brand, M. Gutenberg, D. Nanongkai, T. Saranurak, A. Sidford, and H. Sun. Proceedings of the 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
Start Year 2020
 
Title STAG: Spectral Toolkit of Algorithms for Graphs 
Description STAG is an open-source library of efficient spectral algorithms for graphs. It is the first such algorithmic library mainly written in C++, with a python wrapper around the underlying C++ library for python users. The source code of STAG can be found from GitHub page. 
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact We have so far implemented the local graph clustering component. Comparing with existing open-source implementation for local graph clustering, STAG connects to a Neo4j database running in the AuraDB cloud service, or a database running locally. With this feature, one can run local graph clustering on the entire Wikipedia graph of more than 20GB with a single PC. It is the first open-source implementation of local graph clustering that connects a graph database. 
URL https://staglibrary.io