Gao Riqiang, Huo Yuankai, Bao Shunxing, Tang Yucheng, Antic Sanja L, Epstein Emily S, Balar Aneri B, Deppen Steve, Paulson Alexis B, Sandler Kim L, Massion Pierre P, Landman Bennett A
Vanderbilt University, Nashville, TN 37235, USA.
Vanderbilt University Medical Center, Nashville, TN 37235, USA.
Mach Learn Med Imaging. 2019 Oct;11861:310-318. doi: 10.1007/978-3-030-32692-0_36. Epub 2019 Oct 10.
The field of lung nodule detection and cancer prediction has been developing rapidly with the support of large public data archives. Previous studies have largely focused on cross-sectional (single) CT data. Herein, we consider longitudinal data. The Long Short-Term Memory (LSTM) model addresses learning with regularly spaced time points (i.e., equal temporal intervals). However, clinical imaging follows patient needs, and acquisitions are often heterogeneous and irregular. To model both regular and irregular longitudinal samples, we generalize the LSTM model with the Distanced LSTM (DLSTM) for temporally varied acquisitions. The DLSTM includes a Temporal Emphasis Model (TEM) that enables learning across regularly and irregularly sampled intervals. Briefly, (1) the temporal intervals between longitudinal scans are modeled explicitly; (2) temporally adjustable forget and input gates are introduced for irregular temporal sampling; and (3) the latest longitudinal scan has an additional emphasis term. We evaluate the DLSTM framework on three datasets: simulated data, 1794 National Lung Screening Trial (NLST) scans, and 1420 clinically acquired data samples with heterogeneous and irregular temporal accession. The experiments on the first two datasets demonstrate that our method achieves competitive performance on both simulated and regularly sampled data (e.g., improving the F1 score from 0.6785 with LSTM to 0.7085 on NLST). In external validation on clinically and irregularly acquired data, the benchmarks achieved an area under the ROC curve (AUC) of 0.8350 (CNN feature) and 0.8380 (LSTM), while the proposed DLSTM achieved 0.8905.
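The gate modulation described in the abstract can be illustrated with a small sketch. The abstract does not give the functional form of the TEM, so the exponential decay used here, its parameters `a` and `c`, the scalar cell, and every function name are assumptions made for illustration, not the authors' implementation. The idea shown is only that the forget and input gates are scaled by a weight that depends on each scan's time distance to the latest scan, so the latest scan (distance 0) receives the largest weight.

```python
import math

def temporal_emphasis(distance, a=1.0, c=0.5):
    """Hypothetical emphasis weight (assumed form): decays with the
    time distance between a scan and the latest scan."""
    return a * math.exp(-c * distance)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dlstm_like_step(x, h_prev, c_prev, distance, w):
    """One step of a scalar LSTM cell whose forget and input gates
    are scaled by the emphasis weight (illustrative only)."""
    d = temporal_emphasis(distance)
    f = d * sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])  # forget gate
    i = d * sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])  # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])      # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])    # candidate state
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

def run_sequence(features, scan_times, w):
    """Process scans in chronological order; each scan's distance is
    measured back from the latest scan time."""
    latest = scan_times[-1]
    h, c = 0.0, 0.0
    for x, t in zip(features, scan_times):
        h, c = dlstm_like_step(x, h, c, latest - t, w)
    return h

# Toy weights and per-scan features (placeholders, not learned values).
weights = {k: 0.5 for k in
           ("wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wg", "ug", "bg")}
features = [0.2, 0.5, 0.9]
regular = run_sequence(features, [0.0, 1.0, 2.0], weights)    # equal intervals
irregular = run_sequence(features, [0.0, 0.1, 2.0], weights)  # uneven intervals
```

With identical features, the two calls give different outputs because the second scan sits at a different distance from the latest one. A plain LSTM would treat both sequences identically, since it sees only the order of the scans and not the intervals between them.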