Toews Matthew, Wachinger Christian, Estepar Raul San Jose, Wells William M
Inf Process Med Imaging. 2015;24:339-50. doi: 10.1007/978-3-319-19992-4_26.
This paper proposes an inference method well suited to large sets of medical images. The method is based on a framework in which distinctive 3D scale-invariant features are indexed efficiently to identify approximate nearest-neighbor (NN) feature matches in O(log N) computational complexity, where N is the number of images. It thus scales well to large data sets, in contrast to methods based on pair-wise image registration or feature matching, which require O(N) complexity. Our theoretical contribution is a density estimator based on a generative model that generalizes kernel density estimation and K-nearest neighbor (KNN) methods. The estimator can be used for on-the-fly queries, without requiring explicit parametric models or an off-line training phase. The method is validated on a large multi-site data set of 95,000,000 features extracted from 19,000 lung CT scans. Subject-level classification identifies all images of the same subjects across the entire data set, including unintentional duplicate scans, despite deformation due to breathing state. State-of-the-art performance is achieved in predicting chronic obstructive pulmonary disease (COPD) severity across the 5-category GOLD clinical rating, with an accuracy of 89% when predictions that are exact or off by one category are both counted as correct.
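The idea of an estimator that sits between kernel density estimation and KNN, together with the "exact or off by one" accuracy criterion, can be illustrated with a minimal Python sketch. This is not the authors' estimator or indexing scheme: the function names (`kernel_knn_scores`, `classify_image`, `within_one_accuracy`), the Gaussian kernel, the choice of the k-th neighbor distance as bandwidth, the brute-force search (used here in place of an indexed approximate NN search), and the integer encoding of severity grades are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def kernel_knn_scores(query, features, labels, k=3):
    """Score labels by a Gaussian-weighted vote over the k nearest features.

    Assumption: the bandwidth is the distance to the k-th neighbor, so the
    estimate adapts to local density (KNN-like) while still weighting each
    neighbor by its distance (KDE-like).
    """
    # Brute-force search for clarity; an indexed approximate NN search
    # would replace this step in a large-scale setting.
    nearest = sorted(
        (math.dist(query, f), lab) for f, lab in zip(features, labels)
    )[:k]
    bandwidth = max(nearest[-1][0], 1e-12)  # guard against zero distance
    scores = defaultdict(float)
    for dist, lab in nearest:
        scores[lab] += math.exp(-0.5 * (dist / bandwidth) ** 2)
    total = sum(scores.values())
    return {lab: s / total for lab, s in scores.items()}


def classify_image(query_features, features, labels, k=3):
    """Accumulate per-feature label scores and return the top label."""
    totals = defaultdict(float)
    for q in query_features:
        for lab, s in kernel_knn_scores(q, features, labels, k).items():
            totals[lab] += s
    return max(totals, key=totals.get)


def within_one_accuracy(predicted, actual):
    """Fraction of predictions that are exact or off by one category.

    Assumption: severity grades are encoded as consecutive integers.
    """
    hits = sum(abs(p - a) <= 1 for p, a in zip(predicted, actual))
    return hits / len(actual)
```

Calling `classify_image(query_features, features, labels)` with lists of equal-length numeric tuples and one label per indexed feature returns the label with the largest accumulated score. No training step is involved, which mirrors the on-the-fly querying described in the abstract.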