
Big Data and Informatics

The efforts of my lab have transformed medical image processing at Vanderbilt. Prior to 2010, research MRI scans were burned to DVD and placed on a shelf; the PI was responsible for all data archival, processing, and collaboration. Very little medical image processing was performed on Vanderbilt's high-performance computing facilities. Through the formation of the VUIIS Center for Computational Imaging, we created a state-of-the-art database, processing, and collaboration facility for medical imaging.

Today, all human research scans taken at VUIIS are automatically routed to a long-term PACS archive and mirrored in an eXtensible Neuroimaging Archive Toolkit (XNAT) server to provide secure, multi-site access to data resources. Demographics and non-imaging measures are recorded in Vanderbilt’s HIPAA-compliant distributed REDCap database and matched with the XNAT archive. This system runs on an enterprise Dell server with Windows Hyper-V virtualization to provide scalable on-demand performance, while the Vanderbilt ACCRE facility provides fee-for-service infrastructure for teraflop computation. Automated quality analysis pipelines have been developed and are supported using the community-driven PyCap and PyXNAT Python projects. Our PACS/XNAT system for the VUIIS Center for Human Imaging has captured 18,576 studies from two human 3T MRI systems and one human 7T MRI system. Our XNAT systems have enabled large-scale automated analysis and quality assurance/control across thousands of imaging sessions with a reasonable level of human oversight. These efforts have directly contributed to the NSF-supported expansion of the ACCRE facility with larger computation nodes to support the memory and CPU requirements of imaging, and to expanded imaging informatics components in numerous NIH proposals. These capabilities are attracting high-caliber collaborators from both within and outside Vanderbilt.
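As a rough illustration of what matching non-imaging records with an imaging archive can involve, the sketch below pairs demographic records with imaging sessions through a shared subject identifier and reports anything left unmatched for human review. It is a minimal, hypothetical example and not our production pipeline: the function name `match_records`, the field name `subject_id`, and the record layout are all assumptions, and the records are plain Python dictionaries rather than data pulled through the REDCap or XNAT interfaces.

```python
from collections import defaultdict


def match_records(demographics, sessions, key="subject_id"):
    """Pair demographic records with imaging sessions via a shared key.

    Hypothetical sketch: `key` and the dictionary layout are assumptions,
    not the actual REDCap or XNAT schema.
    """
    # Group imaging sessions by subject so each subject maps to a list.
    sessions_by_subject = defaultdict(list)
    for session in sessions:
        sessions_by_subject[session[key]].append(session)

    matched = {}
    missing_imaging = []  # subjects with demographics but no sessions
    for record in demographics:
        subject = record[key]
        if subject in sessions_by_subject:
            matched[subject] = {
                "demographics": record,
                "sessions": sessions_by_subject[subject],
            }
        else:
            missing_imaging.append(subject)

    # Subjects with sessions but no demographic record.
    known = {record[key] for record in demographics}
    missing_demographics = sorted(set(sessions_by_subject) - known)
    return matched, missing_imaging, missing_demographics


# Illustrative, made-up records.
demographics = [
    {"subject_id": "S01", "age": 30},
    {"subject_id": "S02", "age": 41},
]
sessions = [
    {"subject_id": "S01", "session": "scan_a"},
    {"subject_id": "S03", "session": "scan_b"},
]
matched, missing_imaging, missing_demographics = match_records(demographics, sessions)
```

Returning the two unmatched lists separately keeps the automated step simple while leaving the decision about each discrepancy to a person, in keeping with the human oversight described above.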

Additionally, I have led the integration of imaging resources with the established clinical data reuse efforts of the Synthetic Derivative and Research Derivative projects through a Vanderbilt Discovery grant. The specific aims of the research program are to (1) develop appropriate informatics representations to anonymize and link radiology data to the Synthetic Derivative project, (2) engineer appropriate software solutions to transfer data, and (3) establish large-scale data transfer from the clinical system to the anonymized system in a safe and secure manner. We have performed real-time capture and anonymization of all clinical CT and MRI data from the Radiology Department of the Vanderbilt University Medical Center (VUMC) for the Synthetic Derivative’s Imaging Project. In 2016/2017, we are transitioning the informatics system from research to production support as part of the Vanderbilt ImageVU project.