Fault-tolerance for massive-scale distributed systems

Lead Research Organisation: Lancaster University
Department Name: Computing & Communications

Abstract

Cloud computing refers to both the applications delivered as services over the Internet and the hardware and systems software in the data centres that provide those services. The use of cloud computing has increased rapidly over the last decade with the rise of internet corporations such as Amazon Inc. and Google LLC. These infrastructures operate on a pay-as-you-go basis, with companies providing Infrastructure as a Service (IaaS) that allows clients to set up and customise execution environments to their applications' needs.
As the speed and power of computer hardware increase, video game developers continue to push the limits of what this hardware can execute. However, this increase in power comes with an increase in price, and players often cannot afford to build a system capable of running games on the highest settings. This is where cloud computing can be combined with gaming to provide on-demand game streaming. Current approaches, such as Outatime and Nvidia Grid, allow full games to be distributed across multiple cloud servers, with the results streamed back to the end-user. With this setup the user no longer needs to own a powerful machine to run the game. However, because the client must now rent games from the cloud provider, this can still be expensive: the hardware requirements of the servers drive up the rental price.
Online multiplayer gaming has also become dominant in recent years, with seven of the top ten most played games on the popular PC platform 'Steam' being either mostly or fully online. In online multiplayer games, much of the interactive simulation must be performed on both the server side and the client side because of synchronisation dependencies. With streamed online games this duplicated computation can be avoided, as the game state resides entirely on the servers, which communicate among themselves and require only input from the clients.
Instead of distributing entire games, this research will investigate the viability of distributing computationally heavy sections of the game engine in a 'cloud bursting' style to keep the game performant under heavy load. This will lower the resource requirements for the servers, as sections of the engine will still be computed by the client if the client's machine is powerful enough to do so.
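
The 'cloud bursting' idea above can be illustrated with a minimal sketch. It is not the project's method: the names Subsystem and plan_frame, the per-frame time budget, and the cheapest-first greedy rule are all assumptions introduced here for illustration, and network latency to the cloud is ignored. The sketch keeps as much engine work on the client as fits within one frame's time budget and sends only the remaining heavy sections to a server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subsystem:
    # Hypothetical description of one engine section (illustrative only).
    name: str
    local_cost_ms: float  # estimated time to run on the client this frame
    offloadable: bool     # whether this work may be sent to a cloud server


def plan_frame(subsystems, frame_budget_ms):
    """Split subsystems into (local, offloaded) name lists for one frame.

    Assumed policy: work that cannot be offloaded always stays on the
    client. Offloadable work stays local, cheapest first, while it fits
    in the remaining budget; whatever does not fit bursts to the cloud.
    """
    local, offloaded = [], []
    remaining = frame_budget_ms

    # Work that must stay local is charged against the budget first.
    for s in subsystems:
        if not s.offloadable:
            local.append(s.name)
            remaining -= s.local_cost_ms

    # Greedily keep the cheapest offloadable work on the client.
    candidates = sorted(
        (s for s in subsystems if s.offloadable),
        key=lambda s: s.local_cost_ms,
    )
    for s in candidates:
        if s.local_cost_ms <= remaining:
            local.append(s.name)
            remaining -= s.local_cost_ms
        else:
            offloaded.append(s.name)

    return local, offloaded


# Illustrative engine sections with made-up costs.
ENGINE = [
    Subsystem("input", 1.0, offloadable=False),
    Subsystem("rendering", 8.0, offloadable=False),
    Subsystem("physics", 6.0, offloadable=True),
    Subsystem("ai", 4.0, offloadable=True),
]

if __name__ == "__main__":
    print(plan_frame(ENGINE, frame_budget_ms=16.0))
    print(plan_frame(ENGINE, frame_budget_ms=33.0))
```

With the made-up costs above, a 16 ms budget keeps input, rendering and AI on the client and offloads physics, while a 33 ms budget keeps everything local. This mirrors the abstract's point that a sufficiently powerful client places no load on the servers.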

Publications

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/N509504/1                                   01/10/2016  30/09/2021
2141884            Studentship   EP/N509504/1  01/10/2018  03/03/2022  James Bulman