S.H. Ho Urology Centre, Department of Surgery, The Chinese University of Hong Kong, Hong Kong, China.
Centre for Smart Health, School of Nursing, The Hong Kong Polytechnic University, Hong Kong, China.
Prostate Cancer Prostatic Dis. 2022 Apr;25(4):672-676. doi: 10.1038/s41391-021-00429-x. Epub 2021 Jul 15.
To investigate the value of machine learning (ML) in enhancing prostate cancer (PCa) diagnosis.
Consecutive systematic prostate biopsies performed from January 2003 to June 2017 were used as the training cohort, and prospective biopsies performed from July 2017 to November 2019 were used as the validation cohort. Men were included if PSA was 0.4-50 ng/mL and digital rectal examination (DRE) findings, transrectal ultrasound (TRUS) prostate volume, and TRUS abnormality status were known. Clinically significant PCa (csPCa) was defined as Gleason 3 + 4 or higher cancer. The area under the receiver operating characteristic (ROC) curve (AUC) was compared between PSA, PSA density, the European Randomized Study of Screening for Prostate Cancer (ERSPC) risk calculator (ERSPC-RC), and various ML techniques using PSA, DRE and TRUS information. The ML techniques were XGBoost, LightGBM, CatBoost, support vector machine (SVM), logistic regression (LR), and random forest (RF), with cost-sensitive learning applied.
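The two simplest quantities in this comparison can be sketched in a few lines of Python. The sketch below assumes the conventional definition of PSA density (serum PSA divided by TRUS prostate volume) and computes the AUC in its rank-based (Mann-Whitney) form. The function names and the six toy records are hypothetical illustrations, not study data, and this is not the authors' analysis code.

```python
def psa_density(psa_ng_ml, volume_ml):
    """PSA density: serum PSA (ng/mL) divided by TRUS prostate volume (mL)."""
    if volume_ml <= 0:
        raise ValueError("prostate volume must be positive")
    return psa_ng_ml / volume_ml


def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy records: (PSA in ng/mL, TRUS volume in mL, csPCa label).
toy = [
    (4.0, 80.0, 0),
    (6.0, 30.0, 1),
    (8.0, 100.0, 0),
    (5.0, 20.0, 1),
    (12.0, 80.0, 0),
    (9.0, 50.0, 1),
]
labels = [y for _, _, y in toy]
auc_psa = roc_auc([psa for psa, _, _ in toy], labels)
auc_psad = roc_auc([psa_density(psa, vol) for psa, vol, _ in toy], labels)
```

The same `roc_auc` helper could score any predictor's output, including the predicted probabilities of a fitted classifier, which is how a single AUC scale can compare PSA, PSA density, a risk calculator and an ML model.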
Training and validation cohorts included 3881 and 778 consecutive men, respectively. The RF model performed better than the other ML techniques and than PSA, PSA density and ERSPC-RC for prediction of PCa or csPCa in the validation cohort. For csPCa prediction, the AUCs of PSA, PSA density, ERSPC-RC and RF were 0.71, 0.80, 0.83 and 0.88, respectively. At 90-95% sensitivity for csPCa, the RF model achieved a negative predictive value (NPV) of 97.5-98.0% and avoided 38.3-52.2% of unnecessary biopsies. Decision curve analyses (DCA) showed that the RF model provided net clinical benefit over PSA, PSA density and ERSPC-RC in csPCa prediction.
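The operating-point metrics reported here can be illustrated with a minimal sketch. It makes two assumptions that the abstract does not state: an "avoided unnecessary biopsy" is counted as a man without csPCa whose score falls below the cut-off, and net benefit follows the standard decision-curve formula TP/n - (FP/n) x pt/(1 - pt). All names and the ten toy scores are hypothetical and are not study data.

```python
import math


def threshold_for_sensitivity(scores, labels, target):
    """Highest cut-off that flags at least `target` of the positive cases
    when biopsy is recommended for score >= cut-off."""
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not pos:
        raise ValueError("no positive cases")
    # Small epsilon guards against floating-point overshoot in target * len(pos).
    needed = max(1, math.ceil(target * len(pos) - 1e-9))
    return pos[needed - 1]


def confusion_counts(scores, labels, threshold):
    """Return (tp, fp, tn, fn) for the rule 'biopsy if score >= threshold'."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        if s >= threshold:
            if y == 1:
                tp += 1
            else:
                fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def rule_summary(scores, labels, threshold):
    """Sensitivity, NPV and avoided-biopsy fraction at a given cut-off."""
    tp, fp, tn, fn = confusion_counts(scores, labels, threshold)
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
        # Assumption: unnecessary biopsies are those in men without csPCa.
        "avoided": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def net_benefit(probabilities, labels, pt):
    """Decision-curve net benefit at threshold probability pt (0 < pt < 1)."""
    tp, fp, _, _ = confusion_counts(probabilities, labels, pt)
    n = len(labels)
    return tp / n - (fp / n) * (pt / (1.0 - pt))


# Hypothetical predicted probabilities and csPCa labels for ten men.
probs = [0.05, 0.10, 0.12, 0.20, 0.30, 0.35, 0.50, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1]

cutoff = threshold_for_sensitivity(probs, labels, target=0.90)
summary = rule_summary(probs, labels, cutoff)
nb_model = net_benefit(probs, labels, pt=0.20)
nb_biopsy_all = net_benefit([1.0] * len(labels), labels, pt=0.20)
```

A decision curve repeats the `net_benefit` calculation over a range of `pt` values for each predictor and for the biopsy-all and biopsy-none strategies. A model shows net clinical benefit where its curve lies above the alternatives.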
Using the same clinical parameters, ML techniques outperformed ERSPC-RC and PSA density in csPCa prediction and could avoid up to 50% of unnecessary biopsies.