Algorithmic Support for Massive Scale Distributed Systems

Lead Research Organisation: University of Leeds
Department Name: Sch of Computing

Abstract

Resource scheduling in massive-scale distributed systems is the process of matching demand with supply. Demand is associated with requests for resources to execute workloads, such as jobs, tasks and applications. Typical resources in a distributed computing system include servers within a data centre cluster. A scheduler aims to achieve several goals, for example, to maximise system throughput, to minimise response time, to optimise energy usage, etc. These goals may conflict (e.g. throughput versus latency), and the scheduler needs to make a suitable compromise, depending on the user's needs and objectives.
In a data centre system with hundreds of thousands of distributed servers, its massive scale is characterised by a number of factors that contribute to the system complexity:
- the number of server nodes in the cluster, interconnections between resources and heterogeneity of resources (different types of CPUs, memories, local storages);
- the number of concurrent jobs in the system and their arrival rate;
- heterogeneity of jobs (different requirements of CPU, memory and local storage; different patterns of resource usage, long-running jobs vs short-alive jobs; urgent jobs vs jobs with loose deadlines).
The key requirement for the system is its scalability - the ability of the system to sustain the required throughput level (such as operations per second) while confining the perceptional response latencies to a level similar to a small or medium size system.
In our project, we aim to address the following challenges:
(a) scheduling at scale (to make prompt scheduling decisions at a rapid rate);
(b) resource utilisation at scale (to improve utilisation of resources while maintaining high quality of service);
(c) Quality-of-Service provision at scale (to satisfy requirements of diverse workloads).
Existing scheduling algorithms developed for practical systems are often designed largely based on empirical knowledge, experience, and best effort. Due to the lack of theoretical foundation, performance of those algorithms cannot be always guaranteed. On the other hand, scheduling algorithms proposed by the theoretical community are usually based on oversimplified abstract system models. Theoretically sound algorithms, with guaranteed accuracy and time complexity, are often impractical because system models do not reflect practical complexity of real systems, and even minor adjustments of system models towards real systems make algorithms no longer applicable.
In our project, theoretical and applied experts will consolidate efforts to conduct jointly an interdisciplinary study, overcoming the shortcomings of isolated research. Overall, our project is 1) methodologically driven, attempting to extend the applicability of the most powerful techniques of mathematical optimisation; 2) application driven, where the challenges of massive-scale distributed systems invoke new developments of scheduling methodology; and 3) practice driven, where the research direction is based on hands-on experience of distributed systems specialists.

Planned Impact

This project will address scientific and practical challenges in resource scheduling and decision making of Cloud-based cluster management systems by pulling together theoretical and applied research: 1) enabling scalable and prompt scheduling decisions in modern massive-scale distributed systems, 2) allowing intelligent and autonomous management by combining toolkits of optimisation algorithms and advanced machine learning techniques, and 3) enhancing the overall utilization and reducing the operational costs by utilizing the appropriate resource over-subscription, autonomous system failover and energy optimised models.
This project will proactively engage with academia, industry, and the general public in order to ensure that it delivers a substantial impact on UK's Digital Economy on the UK's society at large. All these will have significant impact on the society in the upcoming massive scale AI based and big data driven computing systems.
Academic impact will be delivered through
- publications in high impact journals and presentations at leading conferences,
- delivering a series of dissemination and training workshops organised at the University of Leeds, at the premises of our project partners, Alibaba Group (China), and Edgetic Ltd. (Leeds), and under the umbrella of the Data Centre Alliance,
- making available the source codes of our software prototypes,
- creating and maintaining the project web site where we will systematically present our findings, publish working papers, preprints, presentation slides, maintain the library of software products and benchmarks used in our experiments.
Industrial impact will be delivered through
- close collaboration with British and international Cloud Data Centre industries (Edgetic Ltd, Alibaba Group and the Data Centre Alliance which joins over 120 companies across Europe and over 20 companies in China),
- publishing peer-reviewed industrial case studies verified by our project's industrial partners,
- publishing project findings on the project web-site and through the partners' extensive professional networks, including the Data Central Portal maintained by the Data Centre Alliance,
- organising project-themed workshop and tutorials at international conferences and within professional institutions such as IEEE and the British Computer Society.
Economic and societal engagement will be achieved through
- knowledge transfer and commercialisation of research results,
- organising new skills training courses for Cloud and AI professionals,
- making steps towards creating a future Doctoral Training Centre in Big Data and Intelligent Manufacturing at the University of Leeds, a university's spin-out company, and enhancing the partnerships with UK's and international companies in the field.

Publications

10 25 50

publication icon
Erlebach T (2023) Parameterised temporal exploration problems in Journal of Computer and System Sciences

publication icon
Erlebach T (2022) Exploration of k-edge-deficient temporal graphs in Acta Informatica

publication icon
Hei Y (2021) Hawk: Rapid Android Malware Detection Through Heterogeneous Graph Attention Networks. in IEEE transactions on neural networks and learning systems

publication icon
Kumar S (2023) DebtCom: Technical Debt-Aware Service Recomposition in SaaS Cloud in IEEE Transactions on Services Computing

publication icon
Li T (2023) BisSiam: Bispectrum Siamese Network Based Contrastive Learning for UAV Anomaly Detection in IEEE Transactions on Knowledge and Data Engineering

publication icon
Mommessin C (2023) Affinity-aware resource provisioning for long-running applications in shared clusters in Journal of Parallel and Distributed Computing

publication icon
Peng H (2021) Streaming Social Event Detection and Evolution Discovery in Heterogeneous Information Networks in ACM Transactions on Knowledge Discovery from Data

publication icon
Peng H (2021) Reinforced Neighborhood Selection Guided Multi-Relational Graph Neural Networks in ACM Transactions on Information Systems

publication icon
Song Y (2021) Joint optimization of cache placement and request routing in unreliable networks in Journal of Parallel and Distributed Computing

publication icon
Yang R (2020) Performance-Aware Speculative Resource Oversubscription for Large-Scale Clusters in IEEE Transactions on Parallel and Distributed Systems

publication icon
Zhu J (2022) QoS-Aware Co-Scheduling for Distributed Long-Running Applications on Shared Clusters in IEEE Transactions on Parallel and Distributed Systems

 
Description Since the start of the project we have collected system requirements (based on Alibaba datasets and latest trends in modern large scale providers of computing resources), formulated formal models of the modern systems and made significant steps towards addressing the key challenges.
Our main focus is on workload co-location, balancing cluster utilisation and applications' quality of service (QoS), taking into account an important type of workloads - cloud based Long Running Applications (LRAs). Microservice architecture advances the manifestation of distributed LRAs (DLRAs), comprising multiple interconnected microservices that are executed in long-lived distributed containers and serve massive user requests. This particularly boosts the requirement for strict QoS management when diverse workloads are mixed.
Our research on workload co-location and co-scheduling by far indicates the effectiveness of holistic QoS management, including detection, containment, mitigation and prediction of the QoS violation in the event of the network uncertainties and latency propagation across dependent microservices. Further challenges include interference-aware job scheduling and fine-grained resource management in uncertain and extra-dynamic environments.

In terms of content and cache management in edge computing environments, there exists a practical need for incentivizing content providers to cache contents at distributed network edges closer to end users. However, this is a particularly challenging problem due to system environments that are uncertain, content placements that couple adjacent time slots, and economic properties that are desired but hard to ensure. Our initial study indicates that devising online auction based mechanisms can incentivize content providers to cache contents continuously in distributed edge caches while enabling the edge network operators to determine bid selection, request dispatching, cache updating, and edge infrastructure provisioning to adapt to dynamic and uncertain environments.

Our research on traffic management for data streaming applications in edge computing environments indicates the importance of handling IoT traffic with different levels of sensitivity and criticality by satisfying the application-specific latency constraints. This question arises in the practical deployment of edge computing, where user data can arrive at a much faster rate than that they can be processed by an edge node. Addressing this question is critical for meeting the latency requirement for latency-sensitive applications. It is imperative to dynamically allocate the uplink bandwidth according to the latency constraint of the application, by giving higher priority to latency-sensitive applications, based on the node's forwarding capacity and the requirements of input data streams. A holistic traffic orchestrator can be helpful to avoid oversubscribing edge nodes and improve the overall system throughput.

In terms of the theoretical research, we have studied more narrow models and performed their in-depth analysis. Having established links to the enhanced versions of the classical scheduling problems with parallel machines and with the bin packing problem, we developed new models (with conflict and affinities) typical for modern massive scale distributed systems. The state-of-the-art knowledge on Vector Bin Packing has been systemised and enriched with further developments. The library Vectorpack has been created, made publicly available and successfully adopted for scheduling in distributed systems with conflicts and affinities.
Exploitation Route We expect that our future findings will
- enhance the body of research on algorithms and systems for optimising resource usage in large-scale computing systems,
- serve as the basis for the development of such systems for actual resource providers.
Sectors Digital/Communication/Information Technologies (including Software),Energy

 
Description Alibaba Scheduling team and Alibaba Damo Academy 
Organisation Alibaba Group
Country China 
Sector Private 
PI Contribution The members of the project team Dr. Renyu Yang and Prof. Jie Xu collaborate with Alibaba Scheduling team and Alibaba Damo Academy on the key problems associated with project: - computing resource over-selling, - QoS-aware workload colocation, - GPU parallelism and deep learning acceleration, - agent-based resource inference and performance prediction, - computing resources capacity planning.
Collaborator Contribution - Shaping research problems and collaborative research - Providing servers for experimental work and assistance with computational experiments - Finalising research findings in journal and conference publications
Impact Joint papers - "Performance-aware Speculative Resource Oversubscription for Large-scale Clusters" published in IEEE TPDS - "QoS-Aware Co-Scheduling for Distributed Long-Running Applications on Shared Clusters" submitted to IEEE TPDS - "STRONGHOLD: Fast and Affordable Billion-scale Deep Learning Model Training" (under submission) - "Perph: A Workload Co-location Agent with Online Performance Prediction and Resource Inference" published in IEEE/ACM CCGRID - "An Affinity-Aware Capacity Planning for Scheduling Long-Running Applications in Shared Cluster"
Start Year 2020
 
Description Beihang University 
Organisation Beihang University
Department School of Computer Science & Engineering
Country China 
Sector Academic/University 
PI Contribution Our project team member, Dr. Renyu Yang and Prof. Jie Xu, constantly collaborates with the Cloud system team at Beihang (led by Prof. Chunming Hu, Prof. Tianyu Wo) and AI team (Prof. Jianxin Li and Dr. Hao Peng). The collaborative research spans over a broad range of problems related to our project: resource oversubscription, QoS-aware co-scheduling in shared clusters, cache management in edge computing, deep learning models for scheduling systems and fault tolerance, resource management optimisation for deep learning models
Collaborator Contribution Joint research on the topics outlined in the previous field. Additionally, the Beihang team provide computing facilities (servers) and assists in performing computational experiments.
Impact A number of papers which are included in the publication list, co-authored by the Beihang colleagues: Incentivizing Online Content Caching in Distributed Edge Networks via Auction-Based Subsidization. Published in IEEE SECON 2022 Reinforced Neighborhood Selection Guided Multi-Relational Graph Neural Networks. ACM Trans. Inf. Syst. 40(4): 69:1-69:46 (2022) Hawk: Reliable Android Malware Detection through Heterogeneous Graph Neural Network. IEEE Trans. on Neural Networks and Learning Systems (TNNLS) 2021 Joint Optimization of Cache Placement and Request Routing in Unreliable Networks. Journal of Parallel and Distributed Computing (JPDC), 2021 LIME: Low-Cost Incremental Learning for Dynamic Heterogeneous Information Networks. Transactions on Computers (TC), 2022 Streaming Social Event Detection and Evolution Discovery in Heterogeneous Information Networks, ACM TKDD, 2021 OWL: Detecting Anomalous Social Network Users by Exploring User Relevance, submitted to Information Sciences (under review) Janus: Latency-Aware Traffic Scheduling for Data Streaming in Edge Computing, submitted to IEEE Trans. on Services Computing (under review) Y. Song, L. Jiao, R. Yang, T. Wo, J. Xu. Incentivizing Online Edge Caching with Edge Provisioning via Auctions. Submitted to IEEE TMC 2023 (under review)
Start Year 2010
 
Description Collaboration with Prof. Denis Trystram and Vincent Fagnon 
Organisation Laboratoire d'Informatique de Grenoble 
Country France 
Sector Academic/University 
PI Contribution The member of our team, Dr. Clement Mommessin, collaborates with Prof. Denis Trystram, Vincent Fagnon and Giorgio Lucarelli on mutli-agent scheduling
Collaborator Contribution The team has successfully completed two pieces of work "Short-Term Ambient Temperature Forecasting for Smart Heaters" and "Two-Agent Scheduling with Resource Augmentation on Multiple Machines".
Impact https://doi.org/10.1007/978-3-031-12597-3_16; https://doi.org/10.1109/ISCC53001.2021.9631550
Start Year 2021
 
Description Collaboration with the Batsim team (the group around Millian Poquet) 
Organisation Inria Grenoble - Rhône-Alpes research centre
Country France 
Sector Public 
PI Contribution The team-member of our project, Clement Mommessin, takes part at the final stages of the development of a simulator BatSim, a scientific simulator to analyze batch schedulers. On the 16th of February, Clement Mommessin made a presentation at the seminar organised by the DataMove (Data Aware Large Scale Computing) research team: "Affinity-Aware Capacity Planning for Scheduling Long-Running Applications in Shared Clusters"
Collaborator Contribution The team works on a new version release of the Batsim simulator and preparing a paper to showcase this new release, including the detailed state of art of simulators, and experimental comparisons against competitors.
Impact The two major outputs is (1) the simulator and (2) the paper like "Batsim: Infrastructure Simulator for Jobs and I/O Scheduling" is under preparation.
Start Year 2021
 
Description Collaborative Research with Dr. Akiyoshi Shioura (Japan) and Prof. Vitaly A. Strusevich (Greenwich) 
Organisation Tokyo Institute of Technology
Country Japan 
Sector Academic/University 
PI Contribution The collaboration is of interdisciplinary nature: the main area of expertise of the UK research team (Dr. Shakhlevich and Prof. Strusevich) is Scheduling Theory, while the Japanese collaborator Dr. Akiyoshi Shioura is internationally leading researcher in Submodular Optimisation. After the project was completed, Dr. Shioura keeps arranging annual visits to our team. We have productive meetings and continue working in-between the meetings via email. As an outcome, our collaborative research portfolio is expanding. Prof. Strusevich is retired from 1 January 2021, but he is actively involved in research. We have two papers completed, one has been recently accepted.
Collaborator Contribution The advantages of the Submodular Optimisation techniques are not fully acknowledged by the general scheduling community and often overlooked. The contribution of Dr. Shioura had led to advancing Submodular Optimisation methods in application to scheduling, resulting in a new methodology for solving scheduling problems via Submodular Optimisation.
Impact Preemptive Scheduling of Parallel Jobs of Two Sizes with Controllable Processing Times. Journal of Scheduling (accepted)
Start Year 2006
 
Description Collaborative Research with Dr. Akiyoshi Shioura (Japan) and Prof. Vitaly A. Strusevich (Greenwich) 
Organisation University of Greenwich
Country United Kingdom 
Sector Academic/University 
PI Contribution The collaboration is of interdisciplinary nature: the main area of expertise of the UK research team (Dr. Shakhlevich and Prof. Strusevich) is Scheduling Theory, while the Japanese collaborator Dr. Akiyoshi Shioura is internationally leading researcher in Submodular Optimisation. After the project was completed, Dr. Shioura keeps arranging annual visits to our team. We have productive meetings and continue working in-between the meetings via email. As an outcome, our collaborative research portfolio is expanding. Prof. Strusevich is retired from 1 January 2021, but he is actively involved in research. We have two papers completed, one has been recently accepted.
Collaborator Contribution The advantages of the Submodular Optimisation techniques are not fully acknowledged by the general scheduling community and often overlooked. The contribution of Dr. Shioura had led to advancing Submodular Optimisation methods in application to scheduling, resulting in a new methodology for solving scheduling problems via Submodular Optimisation.
Impact Preemptive Scheduling of Parallel Jobs of Two Sizes with Controllable Processing Times. Journal of Scheduling (accepted)
Start Year 2006
 
Description Collaborative Research with Prof. Thomas Erlebach (Durham) 
Organisation Durham University
Country United Kingdom 
Sector Academic/University 
PI Contribution Prof. Thomas Erlebach had joined the project team to replace Prof. Vitaly Strusevich (co-I, retired). Our joint work provides the theoretical foundation for algorithmic work of the team.
Collaborator Contribution Complimentary study of theoretical aspects of the applied problems under study.
Impact Research paper under preparation: "Classification and Evaluation of the Algorithms for Vector Bin Packing" (joint Clement Mommesin). https://doi.org/10.1007/s00236-022-00421-5
Start Year 2020
 
Description Collaborative research with Prof. Alexander Kononov 
Organisation Russian Academy of Sciences
Department Sobolev Institute of Mathematics
Country Russian Federation 
Sector Academic/University 
PI Contribution Uncertainty plays an important role in real-world scheduling problems. Our joint work with Prof. Kononov is aimed at addressing the issues of robustness and resiliency in flow shop and open shop systems.
Collaborator Contribution The work on flow-shop scheduling problems under uncertainty is completed. We are finalising our findings on the open shop problem under the uncertainty.
Impact Work in progress; no output yet.
Start Year 2020
 
Description Edgetic 
Organisation Edgetic Ltd
Country United Kingdom 
Sector Private 
PI Contribution The members of the project team, Prof. Jie Xu and Dr. Renyu Yang, participate in early technical discussion with the Edgetic staff on container management and Kubernetes scheduling.
Collaborator Contribution Shaping the problem on container management and Kubernetes scheduling and its preliminary analysis.
Impact The work is ongoing. No output yet.
Start Year 2019
 
Description Partnership with Alibaba Group 
Organisation Alibaba Group
Country China 
Sector Private 
PI Contribution Work in progress; no output yet.
Collaborator Contribution We are at the initial stage of collaboration, collecting requirements for the resource management system and developing a system model.
Impact We are at the initial stage of collaboration. Once we achieve our first milestone in the development of the resource management system, testing, evaluation and fine-tuning will be performed jointly with our partner.
Start Year 2020
 
Description Partnership with Edgetic Ltd 
Organisation Edgetic Ltd
Country United Kingdom 
Sector Private 
PI Contribution Work in progress; no output yet.
Collaborator Contribution We are at the initial stage of collaboration, collecting requirements for the resource management system and developing a system model.
Impact We are at the initial stage of collaboration. Once we achieve our first milestone in the development of the resource management system, testing, evaluation and fine-tuning will be performed jointly with our partner.
Start Year 2020
 
Title HAWK 
Description Open source framework for malware detection through heterogeneous graph attention networks 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact The accompanying paper has attracted 26 citations according to Googles Scholar. The software usage is harder to evaluate. 
URL https://github.com/RingBDStack/HAWK
 
Title Vectorpack 
Description C++ library of optimisation algorithms for the Vector Bin Packing problem 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact The library is being used by the students of the University of Leeds working on their UG and PG projects. At least one student at the Poznan University of Technology works on the project linked to Vectorpack. 
URL https://github.com/Vectorpack/Vectorpack_cpp
 
Description Collaboration meet-ups with industrial partners 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Prof. Jie Xu and Dr. Renyu Yang had technical discussions and meetings with a number of industrial partners with the aim of developing future opportunities for joint project proposals and collaborative work.
The companies are Samsung SAIT (UK), Brandon Medical (UK), KPMG (UK), Kuaishou (China), Zhejiang Telecom (China), Digital Farm Initiative (UK-US)
Year(s) Of Engagement Activity 2021,2022
 
Description Government engagement 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Prof. Jie Xu is an UKCRC Executive Member for two terms, responsible for international activities. He also serves as the EDI representative (Equality Diversion and Inclusion).
Year(s) Of Engagement Activity 2020,2021,2022
 
Description Invited talks and keynote speaches 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Talks by Prof. Jie Xu (keynote speech at ISKE 2021, Dec 2021), Dr Renyu Yang (invited talk to Newcastle University, Mar 2021; Beihang University, April 2021; Zhejiang University of Technology, July 2021)
Year(s) Of Engagement Activity 2021
 
Description Reachout within Alan Turing Institute (Prof. Jie Xu and Dr. Renyu Yang) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Attended the workshop organised the Alan Turing Institute to develop new collaboration opportunities under the Turing Fellow programme and under the Turing Postdoc enrichment programme.
Year(s) Of Engagement Activity 2021,2022