PhD Proposal: Efficient Traffic Detection, Scheduling and Transmission in High-Speed Networks

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Informatics

Abstract

More and more companies build large data centers to provide online services such as web search and online gaming. To support these services, high-speed networks employ multi-rooted tree topologies, which provide multiple paths between source and destination hosts to guarantee high bandwidth. In general, traffic in high-speed networks follows a heavy-tailed distribution: roughly 10% of flows carry approximately 90% of the data, while the remaining 90% of flows generate only about 10% of the overall demand. Most short flows are delay-sensitive. Scheduling them without prior discrimination makes them prone to long queuing delays and packet reordering, which increases their completion time and degrades application performance. Thus, accurately detecting heavy flows at line rate and scheduling traffic appropriately across different paths to attain low flow completion times is crucial in large-scale high-speed networks. In addition, new transport protocols are needed to increase the utilization of link resources in such networks.
In this project, I will design a set of robust traffic engineering solutions to improve heavy flow detection accuracy and reduce flow completion times:
(1) Data structures for highly-accurate flow detection: In current heavy flow detection approaches, flow persistence counters are naively replaced when memory is constrained, resulting in low detection accuracy. I plan to design a new sketch scheme that replaces incumbent flow indicators more carefully through a probabilistic replacement approach, while maintaining high throughput for update and query operations by recording flow keys (e.g. five-tuple information).
(2) Flexible flow scheduling with adaptive rerouting granularity: Paths in current high-speed networks become heterogeneous due to traffic dynamics and link failures, which results in topology asymmetries. For flow scheduling, existing flow-level approaches map each flow to one path, and flowlet-level approaches only reroute when a new flowlet emerges. Both kinds of schemes suffer from low utilization of network resources due to their coarse switching granularity, even under symmetric topologies. In contrast, fine-grained mechanisms are prone to severe packet reordering in asymmetric topologies, leading to performance degradation. To decrease flow completion time in such circumstances, I will propose a new flow scheduling mechanism that adjusts the rerouting granularity adaptively according to the real-time network status.
(3) Accurate subflow adjustment for flow transmission: Traditional TCP is aimed at reliable transport in wide-area networks and relies on packet loss as the congestion signal. Vanilla TCP is inappropriate for data center networks, since it degrades transmission performance, particularly for short flows. This is because packets accumulate in switch buffers as long as there is no packet loss, which may result in long queues and thus large flow completion times. MPTCP uses several subflows to avert packet reordering and reduce flow completion time. However, current MPTCP designs are usually unaware of the appropriate number of subflows, leading to bandwidth under-utilization or frequent timeout events. To avoid long queuing delays and provide low-latency transmission for short flows while guaranteeing high throughput for long ones, I plan to design a transmission mechanism that uses deep reinforcement learning to adjust the number of subflows flexibly according to the real-time network status in high-speed networks.
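To make the probabilistic replacement idea in item (1) concrete, the following is a minimal Python sketch, not the scheme proposed in this project. It assumes an exponential-decay rule loosely modelled on published sketches such as HeavyKeeper: when a new flow collides with an incumbent, the incumbent's counter is decremented with probability decay_base^(-count), and the incumbent is evicted only once its counter reaches zero. The class name DecaySketch, the single-row layout, the CRC32 hash and the decay base of 1.08 are all illustrative assumptions.

```python
import random
import zlib


class DecaySketch:
    """Single-row table of (flow key, counter) buckets.

    Illustrative assumption: colliding flows decay the incumbent's counter
    with probability decay_base ** (-count), so large flows are hard to evict.
    """

    def __init__(self, num_buckets, decay_base=1.08, seed=0):
        self.keys = [None] * num_buckets
        self.counts = [0] * num_buckets
        self.decay_base = decay_base
        self.rng = random.Random(seed)  # seeded for reproducibility

    def _index(self, key):
        # Deterministic hash of the flow key (e.g. a five-tuple).
        return zlib.crc32(repr(key).encode()) % len(self.keys)

    def update(self, key):
        """Record one packet of flow `key`."""
        i = self._index(key)
        if self.counts[i] == 0:
            # Empty bucket: the new flow takes it over.
            self.keys[i], self.counts[i] = key, 1
        elif self.keys[i] == key:
            # Incumbent flow: plain increment.
            self.counts[i] += 1
        elif self.rng.random() < self.decay_base ** (-self.counts[i]):
            # Collision: decay the incumbent with a probability that
            # shrinks exponentially in its current count.
            self.counts[i] -= 1
            if self.counts[i] == 0:
                self.keys[i], self.counts[i] = key, 1

    def query(self, key):
        """Estimated packet count of `key` (0 if it does not own its bucket)."""
        i = self._index(key)
        return self.counts[i] if self.keys[i] == key else 0

    def heavy_flows(self, threshold):
        """Flows whose recorded count is at least `threshold`."""
        return {k: c for k, c in zip(self.keys, self.counts)
                if k is not None and c >= threshold}


if __name__ == "__main__":
    sketch = DecaySketch(num_buckets=1024)
    elephant = ("10.0.0.1", "10.0.0.2", 5000, 80, "TCP")  # hypothetical five-tuple
    for _ in range(10_000):
        sketch.update(elephant)
    for port in range(2_000):  # many one-packet flows
        sketch.update(("10.0.0.3", "10.0.0.4", port, 443, "TCP"))
    print(sketch.heavy_flows(threshold=5_000))
```

Because each bucket stores the flow key next to its counter, a query is a single hash and comparison, and the heavy flows can be listed directly from the table. The estimate for a flow never exceeds its true packet count, since a counter is only incremented by packets of the flow that owns the bucket.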
Datasets: For flow detection, I will utilize anonymized IP traces from CAIDA. For flow scheduling and transmission, I will apply the public Facebook dataset.
Expected Outputs: The expected outcome is several research papers to be submitted to IEEE INFOCOM, IEEE ICNP or ACM CoNEXT. The code developed will be published on GitHub.

Publications


Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/T517884/1                                   01/10/2020  30/09/2025
2590767            Studentship   EP/T517884/1  01/09/2021  28/02/2025  Weihe Li