NaaS: Networks as a Service

Lead Research Organisation: University of Cambridge
Department Name: Computer Laboratory

Abstract

Cloud computing has significantly changed the IT landscape. Today it is possible for small companies or even single individuals to access virtually unlimited resources in large data centres (DCs) for running computationally demanding tasks. This has triggered the rise of "big data" applications, which operate on large amounts of data. These include traditional batch-oriented applications, such as data mining, data indexing, log collection and analysis, and scientific applications, as well as real-time stream processing, web search and advertising.

To support big data applications, parallel processing systems, such as MapReduce, adopt a partition/aggregate model: a large input data set is distributed over many servers, and each server processes a share of the data. Locally generated intermediate results must then be aggregated to obtain the final result.
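
As an illustration of the partition/aggregate model described above, the following is a minimal single-process sketch, not the MapReduce implementation. It uses word counting as a stand-in workload; the function names (`partition`, `process`, `aggregate`, `word_count`) and the round-robin split are assumptions made for the example.

```python
from collections import Counter

def partition(records, n_workers):
    """Split the input into n_workers shares (round-robin; an assumed policy)."""
    return [records[i::n_workers] for i in range(n_workers)]

def process(share):
    """Per-server step: count words in the local share only."""
    counts = Counter()
    for line in share:
        counts.update(line.split())
    return counts

def aggregate(partials):
    """Merge the locally generated intermediate results into the final result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

def word_count(records, n_workers=3):
    shares = partition(records, n_workers)
    partials = [process(share) for share in shares]  # one partial result per server
    return aggregate(partials)
```

In a real deployment each call to `process` would run on a different server, so every partial result has to cross the network before `aggregate` can run; that transfer is the traffic discussed in the next paragraph.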

An open challenge of the partition/aggregate model is that it results in high contention for network resources in DCs when a large amount of data traffic is exchanged between servers. Facebook reports that, for 26% of processing tasks, network transfers are responsible for more than 50% of the execution time. This is consistent with other studies, showing that the network is often the bottleneck in big data applications.

Improving the performance of such network-bound applications in DCs has attracted much interest from the research community. A class of solutions focuses on reducing bandwidth usage by employing overlay networks to distribute data and to perform partial aggregation. However, this requires applications to reverse-engineer the physical network topology to optimise the layout of overlay networks. Even with perfect knowledge of the physical topology, there are still fundamental inefficiencies: e.g. any logical topology with a server fan-out higher than one cannot be mapped optimally to the physical network if servers have only a single network interface.

Other proposals increase network bandwidth through more complex topologies or higher-capacity networks. New topologies and network over-provisioning, however, increase the DC operational and capital expenditures (by up to 5 times according to some estimates), which directly impacts tenant costs. For example, Amazon AWS recently introduced Cluster Compute instances with full-bisection 10 Gbps bandwidth, with an hourly cost of 16 times the default.

In contrast, we argue that the problem can be solved more effectively by providing DC tenants with efficient, easy and safe control of network operations. Instead of over-provisioning, we focus on optimising network traffic by exploiting application-specific knowledge. We term this approach "network-as-a-service" (NaaS) because it allows tenants to customise the service that they receive from the network.
NaaS-enabled tenants can deploy custom routing protocols, including multicast services or anycast/incast protocols, as well as more sophisticated mechanisms, such as content-based routing and content-centric networking.

By modifying the content of packets on-path, they can efficiently implement advanced, application-specific network services, such as in-network data aggregation and smart caching. Parallel processing systems such as MapReduce would greatly benefit because data can be aggregated on-path, thus reducing execution times. Key-value stores (e.g. memcached) can improve their performance by caching popular keys within the network, which decreases latency and bandwidth usage compared to end-host-only deployments.
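
The effect of on-path aggregation can be sketched with a toy model, under stated assumptions: a two-level tree (servers, rack switches, one root switch), partial results represented as `Counter` objects, and link load measured as the number of key/count entries forwarded. The function names and the `run` helper are hypothetical and do not correspond to any NaaS API.

```python
from collections import Counter

def forward_only(children_outputs):
    """A plain switch: relays every incoming partial result unchanged."""
    relayed = []
    for messages in children_outputs:
        relayed.extend(messages)
    return relayed

def aggregate_on_path(children_outputs):
    """A hypothetical NaaS-style switch: merges partial results before forwarding."""
    merged = Counter()
    for messages in children_outputs:
        for partial in messages:
            merged.update(partial)
    return [merged]

def entries_on_link(messages):
    """Toy load metric: number of key/count entries carried by the messages."""
    return sum(len(m) for m in messages)

def run(racks, switch_fn):
    """racks: list of racks, each a list of per-server Counters.
    Returns the messages that leave the root switch towards the reducer."""
    rack_outputs = [switch_fn([[partial] for partial in rack]) for rack in racks]
    return switch_fn(rack_outputs)
```

For example, with `racks = [[Counter(a=1, b=1), Counter(a=2)], [Counter(b=3), Counter(a=1, c=1)]]`, `run(racks, forward_only)` delivers four messages to the reducer, whereas `run(racks, aggregate_on_path)` delivers one message with the same totals. How much is saved in practice depends on how well the aggregation function compresses the data.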

The NaaS model has the potential to revolutionise current cloud computing offerings by increasing the performance of tenants' applications through efficient in-network processing, while reducing development complexity. It aims to combine distributed computation and network communication in a single, coherent abstraction, providing a significant step towards the vision of "the DC is the computer".
 
Description The objective of this research was to explore how the natural structure of computer (datacenter) networks could be exploited to improve the performance of network-centric applications. Our original approach, which used OCaml as a source language, was adapted to capitalise on internal work (Kiwi) using C#; considerable insight was drawn from the in-house OCaml expertise, but some of the third-party hardware generation tools were found to be too immature at the time. The project explored how a code base in one source language could be used to describe hybrid systems; this in turn would permit network-centred designs to be shared programmatically across multiple sub-systems, solving a problem that has plagued heterogeneous hardware designs. With heterogeneous hardware in datacenters now becoming commonplace, the momentum gained by this project will continue to serve as an important reference for the future.
Exploitation Route Alongside the open-source tools and implementations that we make widely available, we envisage follow-up projects that address both performance and a wider suite of datacenter application implementations enabled by the NaaS work to date.

Further, we have already had interest in commercial exploitation, mostly in the form of enthusiasm for further development of the current prototype ideas.

The EMU system also provides the ideal platform for research and development on enhanced network-control functionality and we anticipate exploitation of the EMU system in this way in the future.
Sectors Digital/Communication/Information Technologies (including Software)

URL http://www.naas-project.org/
 
Description Engaging with representatives of industry on the potential for developing the software outputs FLICK and EMU into more widely available products; the NetFPGA platform continues to be the leading project for high-speed networking research and prototyping. Our team continues to engage with both the consumers and the producers of exotic hardware as the challenge of interworking these systems continues to evolve.
First Year Of Impact 2015
Sector Digital/Communication/Information Technologies (including Software),Education
Impact Types Economic

 
Description ECH2020 INDUSTRIAL LEADERSHIP (IL) H2020-ICT-2014-1
Amount € 4,294,265 (EUR)
Organisation European Union 
Sector Public
Country European Union (EU)
Start 10/2014 
End 09/2017
 
Description ECH2020 INDUSTRIAL LEADERSHIP (IL) H2020-ICT-2014-1
Amount € 6,702,748 (EUR)
Organisation European Union 
Sector Public
Country European Union (EU)
Start 01/2015 
End 01/2018
 
Title Where Has My Time Gone? Reproduction Environment and Dataset 
Description A dataset accompanying the PAM 2017 paper "Where Has My Time Gone?". This dataset includes both the scripts used for the measurements, and the results files. 
Type Of Material Database/Collection of data 
Year Produced 2017 
Provided To Others? Yes  
 
Description naas 
Organisation University of Nottingham
Country United Kingdom 
Sector Academic/University 
PI Contribution Technical expertise and problem-space knowledge
Collaborator Contribution Open Source software
Impact Early days.
Start Year 2013
 
Title Emu: Rapid Prototyping of Networking Services 
Description Emu, a new standard library for an FPGA hardware compiler that enables developers to rapidly create and deploy network functionality. Emu allows for high-performance designs without being bound to particular packet processing paradigms. Utilising the Kiwi toolchain, Emu is implemented on the NetFPGA-SUME platform.
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact Due to their performance and flexibility, FPGAs are an attractive platform for the execution of network functions. It has long been a challenge, though, to make FPGA programming accessible to a large audience of developers. An appealing solution is to compile code from a general-purpose language to hardware using high-level synthesis. Unfortunately, current approaches for implementing rich network functionality are insufficient because they lack: (i) libraries with abstractions for common network operations and data structures, (ii) bindings to the underlying "substrate" on the FPGA, and (iii) debugging and profiling support. This paper describes Emu, a new standard library for an FPGA hardware compiler that enables developers to rapidly create and deploy network functionality. Emu allows for high-performance designs without being bound to particular packet processing paradigms. Furthermore, it supports running the same programs on CPUs, in Mininet, and on FPGAs, providing a better development environment that includes advanced debugging capabilities. We demonstrate that network functions implemented using Emu have only negligible resource and performance overheads compared with natively written hardware versions.
URL https://github.com/NetFPGA/NetFPGA-SUME-public/wiki/Reference_emu-Contrib-Project
 
Title Open Source Network Tester 
Description The world's first open-source hardware traffic generator and capture system. OSNT is low cost: it is based on the NetFPGA platform and can be built for less than $2000. OSNT is open source: all the hardware and software designs are freely available for you to use and extend. OSNT works out of the box: it comes with some standard libraries for generating and capturing user-selected packet lengths at different rates. OSNT is extensible: it is easy to add new features and adapt it to your needs. Our approach has demonstrated lower cost than comparable commercial systems while achieving similar levels of precision and accuracy, all within an open-source framework that is extensible with new features to support new applications and that permits validation and review of the implementation.
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact This system sees wide use in commercial, research, and education scenarios across a number of UK, US and other international organisations. Notable deployments include CERN, Stanford University, Western Digital, NetApp, and Barefoot Networks. The system has seen extensions for a wide range of protocols, and been ported to higher-performance hardware platforms (OSNT-SUME on NetFPGA-SUME). 
URL http://osnt.org/