Optimal design of performance measurement experiments for complex, large-scale networks

Lead Research Organisation: Queen Mary University of London
Department Name: Sch of Electronic Eng & Computer Science

Abstract

Network measurement may have social, engineering or commercial motivations. Typical social motivations include the need to understand popular usage and to answer questions about social-networking sites; other possibilities include use as part of legal evidence to illustrate the (lack of) impact of regulation upon the popularity of peer-to-peer networks. The engineering application of measurement is an integral part of the network optimisation process, both to provide a baseline of pre-optimised performance and to quantify improvements. Another area of application is as part of measurement-based algorithms for controlling network access. Commercially, measurement is critical to guaranteeing Service Level Agreements (SLAs) between network and service providers and their customers. These SLAs provide enforceable guarantees on the upper bounds of packet-level performance; e.g. they state that mean delay, delay variation (sometimes called jitter) and packet loss probability will not exceed specified values when measured over an agreed period.

However, a critical problem is that network traffic and topologies are highly variable, and this makes accurate measurement difficult. For this reason measurement can be prone to very large errors in estimating end-to-end delay (both mean delay and jitter) and packet loss rates. Indeed, the optimal measurement of packet-level performance is a challenging open problem in engineering mathematics. Current measurement methods are not designed to provide the maximum information from the minimum data set. In this project the crucial step is to view all network measurements as numerical experiments, in which random processes are sampled and the sampling is constrained by the resources available, e.g. bandwidth. In this way we are able to apply the Statistical Design of Experiments (DOE) to network measurement experiments.

DOE techniques have been applied very successfully in linear and static environments, mainly in biological and some industrial contexts. Most work on DOE has assumed static processes, or deals only with the static aspects of the processes, but network traffic and topologies are highly variable and nonlinear. The first work on DOE for models which are solutions of nonlinear differential equations was in the field of chemical kinetics, co-authored by a member of the Statistics Group at Queen Mary. Subsequent work at Queen Mary has further developed DOE for nonlinear models, or for nonlinear functions of the parameters in a linear model. Although the models involved are fairly small-scale compared with those arising in networks, having a single input variable and no feedback, this has provided a good starting point.

In parallel with the DOE thread at Queen Mary (Statistics Group and Networks Group), the University of Cambridge (Computer Lab) builds on experience in techniques for packet (or packet-flow) classification. Such techniques for the classification of network traffic have previously used features derived from streams of packets. These feature collections are often huge (200+ features), and range in complexity from Fourier transforms and quartile statistics to the mean and variance of packet inter-arrival times and the number of TCP SACK packets. Classification accuracy is often good, but with the disadvantage of complexity and cost.
In this project this previous experience with lightweight application-classification schemes is re-oriented towards learning which traffic characteristics are critical in their influence on delay and loss performance. This approach focuses on actual experimental observations, a better approach than relying on simple queue models whose inbuilt (and limiting) distributions are chosen mainly to allow the resulting system of equations to be solved. The combination of DOE and machine learning promises a real step towards solving the problem of the optimal measurement of packet-level performance.
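As a purely illustrative sketch (our own example in Python, not code from the project), lightweight inter-arrival-time features of the kind mentioned above can be computed directly from packet timestamps; the function name and the particular feature subset chosen here are hypothetical.

    import numpy as np

    def flow_features(arrival_times):
        """Illustrative lightweight per-flow features: inter-arrival-time statistics only."""
        iat = np.diff(np.sort(np.asarray(arrival_times)))  # packet inter-arrival times
        return {
            "iat_mean": iat.mean(),
            "iat_var": iat.var(),
            "iat_q25": np.quantile(iat, 0.25),
            "iat_q75": np.quantile(iat, 0.75),
        }

    # Example: a synthetic flow of 1,000 packets with exponential inter-arrivals.
    rng = np.random.default_rng(0)
    timestamps = np.cumsum(rng.exponential(scale=0.01, size=1000))
    print(flow_features(timestamps))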

Description

Broadband packet communications networks, exemplified by the internet, are supporting virtually all information exchange internationally. Packet-level performance (packet loss and delay) is the dominant factor controlling user experience of these networks, and as user experience is ultimately the key factor driving commercial success, the key network performance measures must be accurately monitored.



A challenging unsolved problem in packet networking has long been: how do we monitor packet level performance in an optimal fashion? This is equivalent to asking how we can ensure that the samples of loss and delay taken from a network contain the maximum amount of information (or that they have minimum variance and bias).
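As a toy illustration of the variance side of this question (a sketch under the simplifying assumption that probe losses are independent Bernoulli events, which real network traffic violates), the spread of a loss-rate estimate shrinks only with the square root of the probe budget; the loss probability used below is a hypothetical value.

    import numpy as np

    rng = np.random.default_rng(0)
    p_true = 0.01                       # assumed "true" loss probability (hypothetical)

    for n in (100, 1_000, 10_000):      # probe budgets
        # 500 repeated measurement experiments, each of n independent probes
        estimates = (rng.random((500, n)) < p_true).mean(axis=1)
        print(f"n={n:6d}  empirical std={estimates.std():.5f}  "
              f"theoretical std={np.sqrt(p_true * (1 - p_true) / n):.5f}")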



In this project we addressed this question. By modelling networks using Markov state models we have been able to treat network measurement and monitoring as numerical experiments. This crucial step then allowed us to optimise these measurement experiments using the Statistical Theory of the Design of Experiments (DOE).
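The sketch below illustrates this modelling step in miniature (it is our own toy example, not the project's model): a two-state, Gilbert-style Markov chain alternates between a "good" and a "bad" loss regime, and a probing experiment is then simply a constrained sampling of that chain. All numerical values are hypothetical.

    import numpy as np

    rng = np.random.default_rng(1)

    # Two-state Markov model of a path: state 0 = "good", state 1 = "bad".
    # P[i, j] is the probability of moving from state i to state j per time step;
    # the per-packet loss probability depends on the current state.
    P = np.array([[0.99, 0.01],
                  [0.10, 0.90]])
    loss_prob = np.array([0.001, 0.20])          # hypothetical values

    def probe_experiment(n_probes, spacing):
        """Send n_probes probes, one every `spacing` chain steps; return loss outcomes."""
        state = 0
        outcomes = np.empty(n_probes, dtype=int)
        for k in range(n_probes):
            for _ in range(spacing):             # the chain evolves between probes
                state = rng.choice(2, p=P[state])
            outcomes[k] = rng.random() < loss_prob[state]
        return outcomes

    print("observed loss rate:", probe_experiment(n_probes=5_000, spacing=10).mean())

Designing the measurement experiment then amounts to choosing quantities such as the number and spacing of probes, under a bandwidth budget, so that the resulting samples are as informative as possible about the model parameters.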



The major results of this project include a utility-based framework for the optimisation of packet network measurement, and a new approach to optimal sampling based on the maximisation of Fisher Information. This latter breakthrough led directly to the development of a framework for a new science of DOE that allows the "items" in any experiment to be inter-connected. This is new, and has the potential to revolutionise the use of DOE across a wide range of domains, including agriculture, drug development, viral-marketing techniques, and social policy research, as well as communications networking.
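The following is a deliberately simplified illustration in the same spirit, not the project's method itself: assuming independent Bernoulli losses and rough prior guesses of each path's loss rate (hypothetical values below), a fixed probe budget can be split across paths so that the summed estimator variance is minimised, exploiting the fact that each probe on path i carries Fisher information 1/(p_i(1-p_i)); this yields a square-root allocation rule.

    import numpy as np

    p = np.array([0.001, 0.01, 0.05])   # rough prior loss-rate guesses (hypothetical)
    N = 10_000                          # total probe budget

    # One Bernoulli probe on path i carries Fisher information 1 / (p_i (1 - p_i))
    # about p_i, so the variance of the loss estimate is p_i (1 - p_i) / n_i.
    # Minimising the summed variance subject to sum(n_i) = N gives n_i proportional
    # to sqrt(p_i (1 - p_i)).
    s = np.sqrt(p * (1 - p))
    n = np.round(N * s / s.sum()).astype(int)

    for p_i, n_i in zip(p, n):
        print(f"loss~{p_i:<6} probes={n_i:5d}  est. std={np.sqrt(p_i*(1-p_i)/n_i):.5f}")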
Exploitation Route

There are two major routes to exploitation for this research: within the communications networking aspect of the digital economy, and in diverse fields (including medicine) outside the digital economy.



Communications networking (digital economy): in industry, network and service providers have an urgent need to accurately monitor the performance of their networks, essentially to ensure that they are meeting the guarantees written into their Service Level Agreements (SLAs). Our research can be used directly to set up pre-existing network monitoring equipment so that the best possible use is made of it.

In academia, our work can have a major impact on the design of optimal simulation strategies for the study of broadband communications networks and, more generally, of connected systems.



Across disciplines (outside the digital economy): this project established the theoretical foundations for the optimal measurement of the state of any physical system that can be represented by a Markov chain. Many systems are well represented by Markov chains: examples include populations in biological research and cell cultures in medical storage facilities. We believe that designing optimal monitoring strategies for cell cultures will have significance in oncology. Medical researchers maintain cold stores of cell cultures that need to be monitored, yet the process of monitoring degrades the samples. Our optimal measurement methodology can address this as a monitoring problem in which the act of observation changes the state of the system; we have already successfully shown how to solve similar problems.
Sectors: Digital/Communication/Information Technologies (including Software)