
Distanced LSTM: Time-Distanced Gates in Long Short-Term Memory Models for Lung Cancer Detection

Posted on Sunday, October 13, 2019 in Deep Learning, Longitudinal, Lung Screening CT.

Gao, R., Huo, Y., Bao, S., Tang, Y., Antic, S.L., Epstein, E.S., Balar, A.B., Deppen, S., Paulson, A.B., Sandler, K.L. and Massion, P.P., 2019, October. Distanced LSTM: Time-Distanced Gates in Long Short-Term Memory Models for Lung Cancer Detection. In International Workshop on Machine Learning in Medical Imaging (pp. 310-318). Springer, Cham.

[Full Text]

Abstract

The field of lung nodule detection and cancer prediction has been rapidly developing with the support of large public data archives. Previous studies have largely focused on cross-sectional (single) CT data. Herein, we consider longitudinal data. The Long Short-Term Memory (LSTM) model addresses learning with regularly spaced time points (i.e., equal temporal intervals). However, clinical imaging follows patient needs with often heterogeneous, irregular acquisitions. To model both regular and irregular longitudinal samples, we generalize the LSTM model with the Distanced LSTM (DLSTM) for temporally varied acquisitions. The DLSTM includes a Temporal Emphasis Model (TEM) that enables learning across regularly and irregularly sampled intervals. Briefly, (1) the temporal intervals between longitudinal scans are modeled explicitly, (2) temporally adjustable forget and input gates are introduced for irregular temporal sampling, and (3) the latest longitudinal scan receives an additional emphasis term. We evaluate the DLSTM framework on three datasets: simulated data, 1794 National Lung Screening Trial (NLST) scans, and 1420 clinically acquired scans with heterogeneous and irregular temporal acquisition. The experiments on the first two datasets demonstrate that our method achieves competitive performance on both simulated and regularly sampled data (e.g., improving the F1 score of LSTM from 0.6785 to 0.7085 on NLST). In external validation on clinically and irregularly acquired data, the benchmarks achieved 0.8350 (CNN feature) and 0.8380 (LSTM) in area under the ROC curve (AUC), while the proposed DLSTM achieved 0.8905.
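To make the gating idea concrete, below is a minimal sketch of an LSTM cell with time-distanced gates in the spirit of DLSTM. It is not the paper's implementation: it assumes a 1D (fully connected) cell rather than the convolutional version used in the paper, and it assumes a simple exponential-decay Temporal Emphasis Model tem(d) = exp(-λ·d) scaling the forget and input gates; the paper's exact TEM may differ. All names and shapes here are illustrative.

```python
# Sketch of an LSTM cell with time-distanced gates (DLSTM-style).
# Assumptions (not from the paper): 1D cell, exponential-decay temporal
# emphasis tem(d) = exp(-lambda * d) applied to the forget and input gates.
import torch
import torch.nn as nn


class DistancedLSTMCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.hidden_size = hidden_size
        # One linear map producing all four gate pre-activations (i, f, g, o).
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)
        # Learnable decay rate for the assumed temporal emphasis exp(-lambda * d).
        self.log_lambda = nn.Parameter(torch.zeros(1))

    def forward(self, x_t, d_t, state):
        # x_t: (batch, input_size) input at this time point
        # d_t: (batch, 1) time distance from this scan to the latest scan
        # state: (h, c), each (batch, hidden_size)
        h, c = state
        i, f, g, o = self.gates(torch.cat([x_t, h], dim=-1)).chunk(4, dim=-1)
        # Temporal emphasis: older scans (larger d_t) contribute less;
        # the latest scan (d_t = 0) gets full weight.
        tem = torch.exp(-torch.exp(self.log_lambda) * d_t)  # in (0, 1]
        i = torch.sigmoid(i) * tem
        f = torch.sigmoid(f) * tem
        c = f * c + i * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c


if __name__ == "__main__":
    # Three irregularly spaced scans per subject; d_t = 0 marks the latest scan.
    cell = DistancedLSTMCell(input_size=16, hidden_size=32)
    h = torch.zeros(2, 32)
    c = torch.zeros(2, 32)
    distances = torch.tensor([[2.5], [1.0], [0.0]])  # e.g., years to latest scan
    for d in distances:
        x = torch.randn(2, 16)
        h, c = cell(x, d.expand(2, 1), (h, c))
```

The key design point, as described in the abstract, is that the gate activations are modulated by the time distance d_t rather than assuming equal intervals, so regularly sampled data is handled as a special case.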

 

The framework of DLSTM (three “steps” in the example). x_t is the input data at time point t, and d_t is the time distance from the time point t to the latest time point. “F” represents the learnable DLSTM component (convolutional version in this paper). H_t and C_t are the hidden state and cell state, respectively. The input data, x_t, could be 1D, 2D, or 3D.