Clinical Imaging Research Centre, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
Nanyang Junior College, Singapore, Singapore.
Neuroinformatics. 2022 Oct;20(4):1065-1075. doi: 10.1007/s12021-022-09587-2. Epub 2022 May 27.
Automated amyloid-PET image classification can support clinical assessment and increase diagnostic confidence. Three automated approaches were compared: global cut-points derived from Receiver Operating Characteristic (ROC) analysis, machine learning (ML) algorithms using regional SUVr values, and a deep learning (DL) network with 3D image input. They were evaluated under varying conditions: training set size, radiotracer, and cohort. A total of 276 [11C]PiB and 209 [18F]AV45 PET images from the ADNI database and our local cohort were used. Global mean and maximum SUVr cut-points were derived using ROC analysis. 68 ML models were built using regional SUVr values, and one DL network was trained with classifications from two visual assessments: one following the manufacturer's recommendations (gray-scale) and one using visually guided reference region scaling (rainbow-scale). ML-based classification achieved accuracy similar to that of ROC classification, but showed better convergence between training and unseen data and did so with fewer training data. Naïve Bayes performed best overall among the 68 ML algorithms. Classification with maximum SUVr cut-points yielded higher accuracy than with mean SUVr cut-points, particularly for cohorts showing more focal uptake. The DL network classified definite cases accurately but performed poorly on equivocal cases. The rainbow-scale assessment standardized image intensity scaling and improved inter-rater agreement. The gray-scale assessment detected focal accumulation better and therefore classified more scans as amyloid-positive. All three approaches generally achieved higher accuracy when trained with the rainbow-scale classification. In summary, ML matched the accuracy of ROC cut-points while generalizing better from training to unseen data, and further work may lead to even more accurate ML methods.
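As a rough illustration of the ROC cut-point approach mentioned above, the following Python sketch picks a global SUVr threshold by maximizing Youden's J (sensitivity + specificity - 1). The abstract does not state which criterion the authors used to select the cut-point, so Youden's J is an assumption here; the function name `youden_cutpoint` and the SUVr values and visual reads are hypothetical and are not data from the study.

```python
def youden_cutpoint(suvr, labels):
    """Return (cutpoint, sensitivity, specificity) maximizing Youden's J.

    A scan is called amyloid-positive when its SUVr >= cutpoint.
    Assumption: Youden's J is the selection criterion (not stated in the abstract).
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need both positive and negative scans")

    best = None
    # Every observed SUVr value is a candidate threshold.
    for c in sorted(set(suvr)):
        tp = sum(1 for v, y in zip(suvr, labels) if v >= c and y == 1)
        tn = sum(1 for v, y in zip(suvr, labels) if v < c and y == 0)
        sens, spec = tp / pos, tn / neg
        j = sens + spec - 1
        # Keep the first (lowest) threshold on ties.
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical global mean SUVr values and visual reads (1 = amyloid-positive).
global_mean_suvr = [1.02, 1.08, 1.11, 1.15, 1.19, 1.31, 1.42, 1.55, 1.63, 1.80]
visual_read = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

cut, sens, spec = youden_cutpoint(global_mean_suvr, visual_read)
print(f"cut-point={cut}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The same function could be applied to maximum rather than mean SUVr values to compare the two kinds of cut-point, which is the contrast the abstract draws for cohorts with more focal uptake.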