Storage, access and transmission of whole-slide images for telepathology applications

Lead Research Organisation: University of Warwick
Department Name: Computer Science

Abstract

Pathology is the branch of medicine that studies the cause, origin, and nature of diseases through the examination of tissue biopsies at a microscopic level. Pathology slides are traditionally prepared by cutting a tissue sample into paper-thin sections and staining them so as to bring out regions of interest (RoIs). A pathologist places these paper-thin sections on a glass slide under a microscope in order to look for a range of features that aid in confirming the presence and malignancy level of the disease. For example, in the case of cancer biopsies, the pathologist analyses the shape, size and number of abnormal and normal cell nuclei in the tissue to confirm the existence and progression of the tumour.

Recent advances in whole-slide digital scanners have made it possible to digitize pathology slides, allowing their storage and manipulation in digital form. The digitized versions of pathology slides, which are called virtual slides or whole-slide images (WSIs), are complementing traditional analysis techniques that rely on pathologists looking under a microscope with techniques that rely on pathologists looking at digital images on a monitor. Moreover, digitization of these slides also makes it possible to provide telepathology services by sharing WSIs, thus reaching isolated hospitals and medical centres. For example, thanks to telepathology, pathologists would be able to send WSIs electronically to others or post them on a secure website, making them available for consultation with other pathologists. As a consequence, more pathologists may be brought into the process of making a diagnosis, which helps to avoid medical errors.

Due to the high resolution required to digitize pathology slides, the resulting WSIs tend to be huge in file size, which results in heavy demands for storage and transmission resources. For example, the digitization of a single core of prostate biopsy tissue, of roughly the dimensions of a stamp, could easily result in 900 million pixels. By comparison, a photograph of 4x5 inches in size scanned at 300 dots per inch, which is the standard resolution for printing in a magazine, results in only 1.8 million pixels. So, WSIs usually require around 500 times more pixels than regular digital images. Moreover, a single pathology study normally comprises more than one biopsy sample. For example, in the case of prostate cancer studies, more than 10 biopsy samples are often required per patient, resulting in hundreds of gigabytes of imaging data per study. As a consequence, the main challenge that currently prevents telepathology from being widely used in clinical settings is the huge file size of WSIs, which makes accessing and transmitting these data over different channels slow. Additionally, their huge file size also prevents WSIs from being widely used in current Picture Archiving and Communications Systems (PACS), which comprise a collection of software and network infrastructure used in hospitals and medical centres to store, share and display medical images. Integrating WSIs into PACS would allow pathologists to use other patient data available in PACS in order to increase the accuracy of diagnosis. Therefore, designing efficient coding methods capable of facilitating the access and transmission of WSIs for telepathology applications, while allowing these data to be integrated into PACS, remains a challenge. This project is mainly concerned with the design of such methods.
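The pixel counts quoted above can be checked with a few lines of arithmetic. The sketch below reproduces the 1.8 million pixel figure and the factor of 500 from the numbers given in the text. The raw-size estimate at the end is an added assumption (three colour channels at 8 bits each, no compression) and is not a figure taken from the project.

```python
# Figures taken from the abstract.
PHOTO_WIDTH_IN = 4
PHOTO_HEIGHT_IN = 5
SCAN_DPI = 300
WSI_PIXELS = 900_000_000  # one prostate biopsy core


def scanned_pixels(width_in, height_in, dpi):
    """Pixels produced by scanning a width x height inch area at the given dpi."""
    return (width_in * dpi) * (height_in * dpi)


photo_pixels = scanned_pixels(PHOTO_WIDTH_IN, PHOTO_HEIGHT_IN, SCAN_DPI)
ratio = WSI_PIXELS / photo_pixels

# Assumption (not stated in the abstract): 3 bytes per pixel, uncompressed.
BYTES_PER_PIXEL = 3
raw_gigabytes = WSI_PIXELS * BYTES_PER_PIXEL / 1e9

print(photo_pixels)   # 1800000
print(ratio)          # 500.0
print(raw_gigabytes)  # 2.7
```

Under that assumption a single core already occupies a few gigabytes before compression, so a study with more than 10 samples quickly reaches tens of gigabytes or more.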

Planned Impact

The main impact of this project will be the improvement in the provision of pathology services to the UK and society in general. A great number of healthcare decisions affecting diagnosis or treatment currently involve a pathology investigation. This project will impact how pathologists work in collaboration with other pathologists by increasing the capacity to share WSIs. Collaborative work will result in diagnoses and treatments with improved accuracy, as pathologists will be able to consult on challenging cases and reach consensus. The increased ability to share and access WSIs will also impact the use of telepathology services to reach areas where no pathology services are available. Society will then benefit from healthcare services with improved quality and fewer delays. The project will also impact the integration of WSIs produced by commercially available scanners into Picture Archiving and Communications Systems (PACS), which is the network and software infrastructure currently used in every hospital trust in England to store, share, access and display medical images. This is particularly important for incorporating other patient-related data available in PACS into the diagnosis process, thus improving the accuracy of subsequent treatments.

The primary industrial impact of this project will be the development of innovative techniques that allow scanner manufacturers to generate WSIs in a compact format amenable to PACS. Working with GE Healthcare, the project will have an immediate impact on the pathology imaging industry. Another sector that will be positively impacted by this project is the pathology care sector. As the project improves the ability to transmit and access WSIs, it will be possible to establish new business models where large hospitals with on-site pathologists can provide virtual laboratory services to small hospitals with no on-site pathologists.

The research sector will also benefit from this project, particularly researchers in pathology and image analysis. The project will have a direct impact on the use of computer-aided (CA) methods to assist in analysis and diagnosis, as the development and accuracy of these methods depend heavily on the ability to access and manipulate WSIs with ease by using, for example, feature extraction and segmentation techniques, which are both an integral part of the proposed project.

Description

In traditional clinical and research scenarios, pathologists examine biological tissue specimens placed on glass slides using optical microscopes. In this workflow, doctors are required to be physically present to manipulate the glass slides and the microscope controls in order to make a diagnosis. In recent years, slide scanners have been developed and perfected and it is now possible to register the glass slides as digital images known as Whole-Slide Images (WSIs). Using adequate software, it is possible for doctors to navigate these WSIs and obtain a visualization experience comparable to that of physically employing glass slides and optical microscopes.
One of the main advantages of WSIs is the possibility of performing concurrent visual examinations from multiple, geographically distant locations. As a result, more specialists can be brought into the diagnosis process, which has been shown to increase its accuracy. However, WSIs have massive sizes, typically requiring several gigabytes of RAW data per image. Hence, the use of WSIs can be very demanding for the IT infrastructures of clinical and research labs. The main goal of this project is to improve the storage, access and transmission of WSIs for telepathology applications. To achieve this, image processing and data compression are employed to produce compact representations of the WSIs, i.e., compressed files significantly smaller than the original RAW images.
Consistent with the original Work Plan submitted for the proposal of this grant, two main research lines have been developed. Both lines resulted in significant contributions to the state-of-the-art of the compression of WSIs and several supporting publications.

In the first research line, the spatial redundancy of WSIs is exploited to improve the lossless coding performance of the state-of-the-art compression algorithm HEVC. Intuitively, if a given region of the WSI is similar to neighbouring regions, it can be more efficient to code the differences between that region and its neighbours than to code the original region. However, in the reference HEVC algorithm, there are many ways of calculating these differences and choosing the optimal one can be a very time-consuming process. Our main contribution to this research line is a fast method for efficiently evaluating several difference-calculation approaches in each spatial region of the WSI, so that the best one can be employed for coding in each case. Instead of testing all possible difference-calculation methods, the morphological properties found in WSIs - e.g., the large number of edges due to the depicted biological structures - are exploited so that only the most relevant candidates are evaluated. The proposed method, which is to be published in [1], reduces coding time by 23.5% while still improving upon the best-performing HEVC-based methods in the literature.
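The idea of coding differences with respect to neighbouring pixels, and of using edge information to avoid testing every difference-calculation method, can be sketched as follows. This is a toy illustration and not the method of [1] or the HEVC intra-prediction process: the four predictors, the sum-of-absolute-residuals cost and the diagonal-based edge estimate in `pruned_candidates` are all assumptions made for the example.

```python
# Illustrative predictors only; HEVC defines its own, larger set of intra modes.
PREDICTORS = {
    "horizontal": lambda b, x, y: b[y][x - 1],      # copy the left neighbour
    "vertical": lambda b, x, y: b[y - 1][x],        # copy the neighbour above
    "diagonal": lambda b, x, y: b[y - 1][x - 1],    # copy the upper-left neighbour
    "average": lambda b, x, y: (b[y][x - 1] + b[y - 1][x]) // 2,
}


def residual_cost(block, name):
    """Sum of absolute prediction residuals, skipping the first row and column."""
    predict = PREDICTORS[name]
    return sum(
        abs(block[y][x] - predict(block, x, y))
        for y in range(1, len(block))
        for x in range(1, len(block[0]))
    )


def best_mode(block, candidates):
    """Return the candidate predictor with the smallest residual cost."""
    return min(candidates, key=lambda name: residual_cost(block, name))


def pruned_candidates(block):
    """Cheap edge-orientation estimate using only the pixels on the block diagonal."""
    n = min(len(block), len(block[0]))
    dx = sum(abs(block[i][i] - block[i][i - 1]) for i in range(1, n))
    dy = sum(abs(block[i][i] - block[i - 1][i]) for i in range(1, n))
    if dx < dy:   # pixels resemble their left neighbour
        return ["horizontal", "average"]
    if dy < dx:   # pixels resemble the neighbour above
        return ["vertical", "average"]
    return list(PREDICTORS)  # no clear orientation: test everything


# A 4x4 block with vertical stripes: every row is identical.
block = [[10 * x for x in range(4)] for _ in range(4)]
exhaustive = best_mode(block, list(PREDICTORS))
fast = best_mode(block, pruned_candidates(block))
print(exhaustive, fast)  # vertical vertical
```

On this block the pruned search evaluates two predictors instead of four and reaches the same decision as the exhaustive search. Whether pruning preserves the decision in general depends on how well the edge estimate matches the image content.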

In the second research line, a different type of redundancy is exploited to improve the compression of WSIs. As in most colour images, WSI pixels are described in terms of three main colour channels: red, green and blue. Due to the staining processes that the tissue samples undergo before the glass slides are scanned, WSIs exhibit only a limited range of hues dependent on the stains employed in each case. Therefore, the red, green and blue channels of each pixel exhibit strong dependencies. These dependencies can be exploited using linear spectral (as opposed to spatial) transformations - also known as Multi-Component Transforms (MCTs) - to significantly improve the compression performance of WSIs.
Our main contribution in this area is a framework conceived to design optimized MCTs with better coding performance than the state of the art. More specifically, a numerical optimization algorithm is applied to iteratively improve an initial MCT so that it better exploits the inter-channel redundancy. Unlike previously existing methods, the proposed framework is tightly coupled with the compression algorithm applied after the spectral transformation (in our case, the JPEG2000 image compression standard) and does not rely on hypotheses about the input image that may not hold for the particular case of WSIs. Experimental results - to be published in [2] along with a description of the proposed framework - suggest that the proposed method yields coding performance gains of up to 5.17 dB over the Karhunen-Loeve Transform (KLT), considered one of the best-performing MCTs in the literature.
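To illustrate the inter-channel redundancy that a spectral transform targets, the sketch below applies the reversible colour transform (RCT) defined in JPEG 2000 Part 1 to synthetic pixels whose three channels move together. The RCT is a fixed transform used here only as a stand-in: it is neither the KLT nor the optimized MCTs of [2]. The synthetic pixel values and the use of zeroth-order entropy as a proxy for coding cost are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(values):
    """Zeroth-order empirical entropy in bits per sample."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def rct_forward(r, g, b):
    """JPEG 2000 reversible colour transform (integer arithmetic, lossless)."""
    y = (r + 2 * g + b) // 4
    return y, b - g, r - g


def rct_inverse(y, cb, cr):
    """Exact inverse of rct_forward."""
    g = y - (cb + cr) // 4
    return cr + g, g, cb + g


def total_entropy(pixels):
    """Sum of the per-channel entropies, a rough proxy for bits per pixel."""
    return sum(entropy(list(channel)) for channel in zip(*pixels))


# Synthetic pixels (an assumption): a shared intensity t plus small
# channel-specific offsets, mimicking strongly dependent R, G and B channels.
pixels = []
for i in range(256):
    t = (i * 37) % 128
    pixels.append((t + 100, t + 20 + i % 3, t + 90 + i % 2))

transformed = [rct_forward(*p) for p in pixels]
bits_before = total_entropy(pixels)
bits_after = total_entropy(transformed)
round_trip_ok = all(rct_inverse(*q) == p for p, q in zip(pixels, transformed))

print(round(bits_before, 2), round(bits_after, 2), round_trip_ok)
```

Because the shared intensity is concentrated in the first output channel, the two difference channels take only a handful of values and the proxy cost drops. A transform optimized for a particular stain and codec, as described above, would be chosen to push this further.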

The main drawback of this approach is its significant time complexity. In order to attain average run times comparable to those of the scanning process of the WSIs, we have developed a fast approximation to the aforementioned optimization framework [3]. In the original framework, tens of candidate MCTs are evaluated on the original WSI. Due to the massive size of WSIs, this can be a time-consuming process. In our fast approximation, the candidate MCTs are evaluated on small portions of the original image to speed up the optimization process. Several strategies for the selection of small albeit representative portions of the original image are described and their performance is compared. The best strategy yields MCTs that significantly improve upon the state of the art (by over 1.47 dB on average) with time complexity comparable to the scanning process.
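The speed-up described above, evaluating candidate transforms on a small portion of the image rather than on all of it, can be sketched as follows. Everything in this example is hypothetical: the three candidate transforms, the entropy-based cost, the synthetic pixels and the "every fourth tile" selection rule are stand-ins, not the candidates, cost function or selection strategies of [3].

```python
import math
from collections import Counter


def entropy(values):
    """Zeroth-order empirical entropy in bits per sample."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Hypothetical candidate transforms, each mapping (r, g, b) to three outputs.
CANDIDATES = {
    "identity": lambda r, g, b: (r, g, b),
    "diff_green": lambda r, g, b: (g, r - g, b - g),
    "diff_red": lambda r, g, b: (r, g - r, b - r),
}


def coding_cost(pixels, name):
    """Proxy cost: sum of per-channel entropies after applying a candidate."""
    transform = CANDIDATES[name]
    channels = zip(*(transform(*p) for p in pixels))
    return sum(entropy(list(channel)) for channel in channels)


def best_candidate(pixels):
    """Return the candidate with the lowest proxy cost on the given pixels."""
    return min(CANDIDATES, key=lambda name: coding_cost(pixels, name))


def split_tiles(pixels, tile_len):
    """Cut a flat pixel list into consecutive tiles of tile_len pixels."""
    return [pixels[i:i + tile_len] for i in range(0, len(pixels), tile_len)]


# Synthetic image (an assumption): 4096 pixels with strongly dependent channels.
pixels = []
for i in range(4096):
    t = (i * 37) % 128
    pixels.append((t + 100, t + 20 + i % 3, t + 90 + i % 2))

# Keep every fourth tile, so only a quarter of the pixels are evaluated.
tiles = split_tiles(pixels, 64)
sampled = [p for tile in tiles[::4] for p in tile]

best_full = best_candidate(pixels)
best_fast = best_candidate(sampled)
print(best_full, best_fast, len(sampled), len(pixels))
```

On this synthetic input both searches select the same candidate while the sampled search processes a quarter of the pixels. On real WSIs the agreement depends on how representative the selected portions are, which is why the choice of selection strategy matters.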

References

[1] V. Sanchez, M. Hernandez-Cabronero, F. Auli-Llinas and J. Serra-Sagrista, "Fast Lossless Compression of Whole Slide Pathology Images Using HEVC Intra-prediction," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016, In Press.

[2] M. Hernandez-Cabronero, F. Auli-Llinas, V. Sanchez and J. Serra-Sagrista, "Transform Optimization for the Lossy Coding of Pathology Whole-Slide Images," IEEE Data Compression Conference (DCC), 2016, In Press.

[3] M. Hernandez-Cabronero, F. Auli-Llinas, V. Sanchez and J. Serra-Sagrista, "Fast MCT Optimization for the Compression of Whole-Slide Images," Submitted to the IEEE International Conference on Image Processing (ICIP), 2016.

Exploitation Route

WSIs exhibit massive sizes that are challenging for current IT infrastructures in clinical and research laboratories. Thanks to the proposed methods, much more efficient storage, access and transmission tools can be developed. Since the compressed files can be made smaller without compromising the quality of the reconstructed images, less disk space is required to store the scanned WSIs. This can have an immediate impact on the maintenance costs of the IT departments of clinics and research centres, including those of the NHS (e.g., the Warwickshire Pathology Services). Moreover, the smaller size of the compressed files also results in shorter transmission times. This not only benefits IT departments, but also makes remote access to the images possible without requiring dedicated communication links. As a result, pathologists can more easily examine the WSIs from outside the network where the images reside, without hindering the user experience. This is especially important in rural or impoverished regions where not enough specialists may be available at a certain time and the usable bandwidth is severely limited. In this scenario, the higher compression performance can enable the immediate intervention of a pathologist, which would otherwise have taken several hours or even days to take place. Finally, the smaller disk space requirements and the faster access to the WSIs allow for the creation of databases of annotated WSIs that require only moderate amounts of resources to be maintained. These databases can then be used for research purposes (e.g., easing the development of computer-aided or human-driven diagnosis methods), as well as for educational purposes.
Even though the research project funded by this grant focuses on WSIs, it is worth noting that the aforementioned coding methods could also be useful for other types of images, including medical and natural images. Therefore, an even wider range of users can potentially benefit from the proposed methods.
Sectors: Healthcare