Elastic Virtual Infrastructure for Research Applications

Lead Research Organisation: University of St Andrews
Department Name: Computer Science

Abstract

This proposal is focused on enabling researchers to simply and rapidly deploy, execute and monitor scientific software on elastic cloud computing infrastructures. Current interfaces to cloud resources are relatively low level and do not allow researchers to easily benefit from the elasticity that cloud infrastructures offer. Researchers have to deal with time-consuming and often error-prone tasks such as managing access credentials, selecting instance types and managing elastic IP addresses, as well as monitoring resource usage and starting, stopping and terminating instances in response; this keeps researchers from focusing directly on their scientific research.
In order to address this problem and to further the uptake of cloud computing services in research, we will develop an elastic wrapper for scientific applications. The elastic wrapper will provide an abstracted gateway to cloud resources and a one-stop-shop interface for researchers wanting to take advantage of those resources for their scientific research. It will abstract the complexities of setting up, configuring and managing cloud resources for scientific research applications and provide facilities for execution and collaboration between multiple research sites working on the same problem. The system will take care of issues such as managing resource usage using the elasticity of cloud resources, as well as fault tolerance to insure against resource failure. This project will provide a pilot implementation of the elastic wrapper that will be a generic solution but will specifically support two exemplar scientific applications and their usage models: Groups, Algorithms, and Programming (GAP), a free, open source system for discrete computational algebra with an emphasis on computational group theory, and IDL, a commercial package for statistical and numerical analysis and visualization of scientific datasets.

Planned Impact

Whilst our pilot project will only directly engage with the researchers identified above, the findings will generalise to a larger population of GAP, IDL and Fortran users as well as complement the portfolio of existing cloud research being conducted at the St Andrews Cloud Computing Co-laboratory (StACC). We estimate that the GAP community within the UK comprises over 100 researchers, and GAP is known to be in serious research use in the UK at sites including St Andrews, Warwick, Aberdeen, Birmingham, QMUL, Bath, Oxford, Cambridge, Imperial College and Southampton universities. The GAP system alone has been referenced in over 800 publications, of which at least 130 have (co-)authors based at UK institutions. The GAP Forum had about 900 subscribers as of August 2010, of which about 100 are from the UK. GAP is used not only by mathematicians and computer scientists working in areas ranging from group theory to discrete optimisation and formal languages, but also by researchers from other discipline areas such as physics and chemistry. We will also be supporting other systems that support SCSCP, such as KANT, Maple, MuPAD and Macaulay2, thereby widening the range of potential users and the potential impact on scientific achievements even further. The size of the IDL and Fortran community is difficult to assess because both are used pervasively in engineering and the physical sciences as well as in other disciplines such as the natural sciences and parts of the social sciences. The modular design of the system will allow future extensions to serve other groups of users working with common research tools such as MATLAB and Mathematica, open-source packages such as Octave or SciLab, or a wide range of other tools. Some of these packages are also available in HPC/HTC environments, but others are not because they are used interactively or because of licensing issues.
Furthermore, the pilot project would help StACC researchers investigate the deployment of real-world scientific workloads onto emerging institutional as well as commercial cloud infrastructures, enabling further research on issues such as cost modelling, scheduling, organisational aspects of the adoption of cloud services and green ICT. There will be future opportunities to explore the scope for using a wrapper approach and to identify at what point scientific codes will need to be changed to make use of elastic cloud services. We will also lay the groundwork for assessing the demand for elastic scientific codes and scoping the need for them.
Future Beneficiaries: We see the elastic wrapper as an enabler for the further uptake of other e-Infrastructure services, as it could further reduce the barriers to resources such as HPC/HTC. For example, a researcher could start by signing up to a service such as Amazon's EC2 and working with individual instances. The elastic wrapper will allow them to make use of the elasticity and to use a combination of cloud resources. Because these resources are under their direct control, it will be easier for them to inspect and debug their code at runtime, enabling rapid development cycles and, ultimately, production of code that can be deployed on HTC/HPC resources.

Publications

 
Description The project has produced a set of tools that allow researchers to create virtual research environments using cloud resources without having to be concerned with the details of how these resources are configured and how many resources need to be provided. We have demonstrated the usefulness of these tools for two specific applications, one from astrophysics and one from computational algebra.
Exploitation Route Tools similar to the ones produced by ELVIRA have also been developed by companies such as RightScale. A number of open source technologies exist today that have functionality partially overlapping with the ELVIRA tools.
Sectors Digital/Communication/Information Technologies (including Software)

 
Description Tools developed in the ELVIRA project were used in the Guardian's Reading the Riots investigation, which analysed the 2011 riots in cities in England. We collaborated with the Guardian's Interactive Team on an interactive visualisation of Twitter traffic relating to the riots, which won an award in the inaugural GEN/Google Data Journalism Awards.
First Year Of Impact 2011
Sector Other
Impact Types Societal, Policy & public services