
Towards Portable Large-Scale Image Processing with High-Performance Computing

Posted on Tuesday, May 8, 2018 in Big Data, Informatics / Big Data.

Yuankai Huo, Justin Blaber, Stephen M. Damon, Brian D. Boyd, Shunxing Bao, Prasanna Parvathaneni, Camilo Bermudez Noguera, Shikha Chaganti, Vishwesh Nath, Greer M. Jasmine, Ilwoo Lyu, William R. French, Allen T. Newton, Baxter P. Rogers, Bennett A. Landman. “Towards Portable Large-Scale Image Processing with High-Performance Computing”. Journal of Digital Imaging. (2018): 1-11.


Abstract

High-throughput, large-scale medical image computing demands tight integration of high-performance computing (HPC) infrastructure for data storage, job distribution, and image processing. The Vanderbilt University Institute for Imaging Science (VUIIS) Center for Computational Imaging (CCI) has constructed a large-scale image storage and processing infrastructure that is composed of (1) a large-scale image database using the eXtensible Neuroimaging Archive Toolkit (XNAT), (2) a content-aware job scheduling platform using the Distributed Automation for XNAT pipeline automation tool (DAX), and (3) a wide variety of encapsulated image processing pipelines called “spiders.” The VUIIS CCI medical image data storage and processing infrastructure has housed and processed nearly half a million medical image volumes with the Vanderbilt Advanced Computing Center for Research and Education (ACCRE), the HPC facility at Vanderbilt University. The initial deployment was native (i.e., direct installations on a bare-metal server) within the ACCRE hardware and software environments, which led to issues of portability and sustainability. First, it could be laborious to deploy the entire VUIIS CCI medical image data storage and processing infrastructure to another HPC center with different hardware infrastructure, library availability, and software permission policies. Second, the spiders were not developed in isolation from one another, which has led to software dependency issues during system upgrades or remote software installation. To address these issues, we describe recent innovations using containerization techniques with XNAT/DAX, which isolate the VUIIS CCI medical image data storage and processing infrastructure from the underlying hardware and software environments.
The newly presented XNAT/DAX solution has the following new features: (1) multi-level portability from system level to the application level, (2) flexible and dynamic software development and expansion, and (3) scalable spider deployment compatible with HPC clusters and local workstations.

Keywords: Containerized, XNAT, DAX, VUIIS, Large-scale, Portable

The framework of the portable implementation of the VUIIS CCI medical image storage and processing infrastructure (VUIIS XNAT + DAX + spiders). The input images in VUIIS XNAT are acquired from scanners, local workstations, and remote Internet access. REDCap provides the non-imaging database as well as the processing commands that trigger the spiders through DAX. The final results are produced as PDF files, which are used for quality assurance.
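
To make the idea of an encapsulated, containerized spider more concrete, the following is a minimal sketch of how a job launcher might assemble the command line that runs one spider inside a container on either an HPC cluster or a local workstation. It is not the DAX implementation. The abstract only states that containerization techniques are used, so the choice of Singularity and Docker as runtimes is an assumption, and the names `SpiderJob`, `build_container_command`, `spider.simg`, and `run_spider.py` are hypothetical.

```python
import shlex
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SpiderJob:
    """Hypothetical description of one containerized spider run."""
    image: str                 # container image holding the spider and its dependencies
    command: List[str]         # command executed inside the container
    binds: Dict[str, str] = field(default_factory=dict)  # host path -> container path


def build_container_command(job: SpiderJob, runtime: str = "singularity") -> List[str]:
    """Assemble an argv list that runs the spider inside its container.

    The runtime is an assumption of this sketch: "singularity" for a shared
    HPC cluster, "docker" for a local workstation.
    """
    if runtime == "singularity":
        argv = ["singularity", "exec"]
        bind_flag = "--bind"
    elif runtime == "docker":
        argv = ["docker", "run", "--rm"]
        bind_flag = "-v"
    else:
        raise ValueError(f"unsupported runtime: {runtime}")

    # Mount host directories so the spider can read inputs and write results.
    for host_path, container_path in job.binds.items():
        argv += [bind_flag, f"{host_path}:{container_path}"]

    return argv + [job.image] + job.command


if __name__ == "__main__":
    job = SpiderJob(
        image="spider.simg",
        command=["python", "run_spider.py"],
        binds={"/data/in": "/input", "/data/out": "/output"},
    )
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_container_command(job)))
```

Because the spider's dependencies live in the image rather than on the host, the same job description can be handed to a different runtime by changing one argument, which is the kind of portability across clusters and workstations that the abstract describes.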