Intuitive Large-scale Image Processing for Biologists

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Informatics

Abstract

Modern cell and developmental biology and the now-established domain of systems biology use quantitative imaging methods to measure the location, dynamics and interaction of molecules in fixed and living cells, at increasingly high spatial and temporal resolution. Quantitative imaging depends on the development, delivery, and use of sophisticated image processing and analysis algorithms. The availability of these data analysis tools is commonly cited as a major bottleneck in scientific discovery. Previously, the absence of common interfaces and defined standards for data structures hindered the sharing of new analysis methods and open, shared access to image datasets. Moreover, the sheer computational cost of running complex algorithms on large datasets demands access to compute facilities that, while they exist, are not accessible to most bench biologists through standardised, intuitive tools. This project combines developments in the OMERO application developed by the Open Microscopy Consortium led by Prof J Swedlow and the Rapid portlet development tool developed in Dr J I van Hemert's laboratory at the UK's National e-Science Centre. The resource generated will be a service with an intuitive user interface that enables bench biologists to access high performance computing resources for processing and analysing their multi-dimensional images of cells and tissues. Rather than developing a single stand-alone resource, this project will provide a vital service for bench biologists, based on world-leading work performed in the UK, that uses common, standardised interfaces and established usability principles to give access to cutting-edge image analysis methods. The resource will be released as a component of the open-source OMERO software suite that is currently either in testing or in daily use at most imaging sites in the UK and over 1200 sites worldwide.
A stable version of Rapid will be bundled in these releases under the same license. We will build this service on top of the Edinburgh Compute Data Facility and the National Grid Service to provide the underlying e-Infrastructure.

Technical Summary

Rapid is a tool for quickly designing and delivering user interfaces for applications that need access to remote compute resources. This need may arise because of the large amount of computing required or simply because the applications must execute on specific platforms. Rapid enables these applications by generating a customised interface that allows a computational task to be performed without referring to the terminology of the underlying computational infrastructure. Its aim is to make submitting remote compute jobs as easy as booking a flight or ordering a book on the Internet. Rapid provides an XML specification that is a light-weight extension of both the Job Submission Description Language (JSDL) and the Extensible HyperText Markup Language (XHTML). The specification describes the User Interface (UI), the resources available and the task-flow. We will extend the specification to add methods that deal specifically with OMERO servers, staging data and uploading data back to a server. We will also extend the specification of the UI to allow more dynamic behaviour than is currently available in Rapid. OMERO will be updated so that it can read task descriptions in the JSDL specification and, using the OMERO.Rapid job submission/UI API, submit these tasks to remote computing resources. For image processing, the community requires a shared resource for processing and analysis algorithms written against the OMERO API. OMERO already uses a central repository for user comments, bug tracking, upgrade checks, and submission of user files that are either useful examples or causing users problems. This site will be updated to hold a repository of OMERO.Rapid Tasks that can be accessed through the OMERO API and then run on remote resources using Rapid's facilities. We will provide a full schema for the specification of OMERO.Rapid Tasks that will be published and maintained alongside the well-established OME-XML.
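As a rough illustration of the kind of JSDL job description that such a specification builds on, the Python sketch below assembles a minimal JSDL document with one executable, its arguments, and stage-in and stage-out entries. It uses only core JSDL 1.0 and JSDL-POSIX elements; it does not reproduce Rapid's own extension elements or the OMERO.Rapid schema, which are not given here. The function name build_job, the executable path, the file names and the example.org URIs are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Namespaces of the JSDL 1.0 core schema and its POSIX application extension.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_job(name, executable, arguments, stage_in, stage_out):
    """Return a JSDL JobDefinition element (illustrative helper, not Rapid's API).

    stage_in and stage_out are lists of (file_name, uri) pairs.
    """
    ET.register_namespace("jsdl", JSDL)
    ET.register_namespace("jsdl-posix", POSIX)

    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")

    ident = ET.SubElement(desc, f"{{{JSDL}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL}}}JobName").text = name

    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = arg

    # Stage-in entries carry a Source URI, stage-out entries a Target URI.
    for direction, entries in (("Source", stage_in), ("Target", stage_out)):
        for file_name, uri in entries:
            staging = ET.SubElement(desc, f"{{{JSDL}}}DataStaging")
            ET.SubElement(staging, f"{{{JSDL}}}FileName").text = file_name
            ET.SubElement(staging, f"{{{JSDL}}}CreationFlag").text = "overwrite"
            endpoint = ET.SubElement(staging, f"{{{JSDL}}}{direction}")
            ET.SubElement(endpoint, f"{{{JSDL}}}URI").text = uri
    return job


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    job = build_job(
        name="deconvolve-timelapse",
        executable="/usr/local/bin/deconvolve",
        arguments=["input.tif", "output.tif"],
        stage_in=[("input.tif", "http://example.org/images/input.tif")],
        stage_out=[("output.tif", "http://example.org/results/output.tif")],
    )
    print(ET.tostring(job, encoding="unicode"))
```

In a portal of the kind described above, a document like this would be generated from the values a biologist enters in the web form, so the user never has to write or read the XML directly.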

Planned Impact

Our goal is to change the way bench biologists use imaging as a scientific discovery and assay tool. As image datasets grow in size and complexity, and analysis and processing tools grow in sophistication, the need for general access to high performance computing will only increase. To date, experimental biologists have rarely made use of central computing resources because interfaces have been designed for scientists familiar with command line interfaces. The need for access to these resources certainly exists: iterative deconvolution, which is often used in live cell imaging, can require many hours to days to process a single timelapse sequence. New high resolution imaging methods like PALM, STORM and 3D Structured Illumination require substantial processing of raw data just to deliver the first image a biologist can use. Our existing projects, OME and Rapid, have already made important contributions to the data management and processing challenges of life scientists. In this project, we will build the tools that provide links from easy-to-use, biologist-friendly interfaces to remotely located, high performance computing resources. We will also build a repository of standard compute-intensive processing tasks that any biologist can access and run on the resource of their choice. Our projects have established track records of producing useful, usable tools for bench biologists. The output of this project will be available for installation and usage by anyone. Given our established record of installations, we can confidently predict that many hundreds to thousands of scientists worldwide will use the output of this project and, in so doing, gain access from their desktop to a repository of image processing tools and the resources to process large datasets or run computationally challenging algorithms. Thus, our work can be expected to be used by most cell biologists in the UK, and very likely throughout the world.

Publications

Morrison CA, Robertson N, Turner A, Van Hemert J, Koetsier J (2010) Handbook of Inorganic Compounds

Van Hemert J (2010) Generating web-based user interfaces for computational science in Concurrency and Computation: Practice and Experience

 
Description We developed a web portal for running long, compute-heavy image processing jobs whose input images are taken straight from OMERO, an existing image-management system for microscopy.

The results of the processing are placed back into OMERO and are directly viewable and ready for further processing.
Exploitation Route Rapid is developed under a GPL v3 license. All the code, examples of use, video and written tutorials, and screencasts of web portals in action are freely available on a website. This allows others to pick up the tool, learn how to use it and apply it to their own domain.
Sectors Chemicals, Environment, Healthcare, Manufacturing, including Industrial Biotechnology

URL http://research.nesc.ac.uk/rapid/
 
Description The OMERO team successfully tested Rapid as a friendly user interface for setting up and controlling long-running image processing jobs on various compute clusters.
First Year Of Impact 2013
 
Title Rapid 
Description Rapid is a cost-effective and efficient way of designing and delivering portal interfaces to tasks that require remote compute resources. The aim of Rapid is to make completing these tasks as simple as purchasing a book or booking a flight on the web. The philosophy of Rapid is to deliver customised graphical user interfaces that enable domain specialists to achieve their tasks. These tasks make use of domain-specific applications that run on remote compute resources, a requirement which is satisfied by translating the task into one or several computational jobs to be performed on Grid and Cloud Computing infrastructures and High-Performance Computing facilities. Customised interfaces allow tasks to be performed without referring to terminology about the underlying computational infrastructure. Moreover, the system can expose only particular features of an application so as not to overwhelm the user.
Type Of Material Improvements to research infrastructure 
Year Produced 2009 
Provided To Others? Yes  
Impact Several research groups around the world routinely use Rapid as the interface to perform their computationally and data-intensive tasks. It is used in seismological analyses, earthquake prediction, brain imaging and microscopy. The Chemistry department at Edinburgh uses it to facilitate the teaching of computational chemistry tools to 150 students each year.
URL http://research.nesc.ac.uk/rapid
 
Title Rapid 
Description Rapid is a cost-effective and efficient way of designing and delivering portal interfaces to tasks that require remote compute resources. The aim of Rapid is to make completing these tasks as simple as purchasing a book or booking a flight on the web. The philosophy of Rapid is to deliver customised graphical user interfaces that enable domain specialists to achieve their tasks. These tasks make use of domain-specific applications that run on remote compute resources, a requirement which is satisfied by translating the task into one or several computational jobs to be performed on Grid and Cloud Computing infrastructures and High-Performance Computing facilities. Customised interfaces allow tasks to be performed without referring to terminology about the underlying computational infrastructure. Moreover, the system can expose only particular features of an application so as not to overwhelm the user.
Type Of Technology Grid Application 
Year Produced 2010 
Impact Further development of Rapid and its application to seismology, earthquake prediction, brain imaging, microscopy and computational chemistry were funded by EPSRC, BBSRC, NERC, JISC, ENGAGE (JISC) and OMII-UK (EPSRC). The software has subsequently been used by other groups to develop portals for other communities, including proteomics and gene sequencing. It is released as Open Source to ensure the longevity of the portals in these communities.
URL https://sourceforge.net/projects/rapidportlet/