Accommodating heterogeneous missing data patterns for prostate cancer risk prediction.

Author information

Department of Life Sciences, Technical University of Munich, Freising, Germany.

Department of Quantitative Health Sciences, Cleveland Clinic Foundation, Cleveland, OH, USA.

Publication information

BMC Med Res Methodol. 2022 Jul 21;22(1):200. doi: 10.1186/s12874-022-01674-x.

DOI:10.1186/s12874-022-01674-x

PMID:35864460

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9306143/

Abstract

BACKGROUND

We compared six commonly used logistic regression methods for accommodating missing risk factor data from multiple heterogeneous cohorts, in which some cohorts do not collect some risk factors at all, and developed an online risk prediction tool that accommodates missing risk factors from the end-user.

METHODS

Ten North American and European cohorts from the Prostate Biopsy Collaborative Group (PBCG) were used to fit a risk prediction tool for clinically significant prostate cancer, defined as Gleason grade group ≥ 2 on standard transrectal ultrasound (TRUS) prostate biopsy. One large European PBCG cohort was withheld for external validation, in which calibration-in-the-large (CIL), calibration curves, and area under the receiver operating characteristic curve (AUC) were evaluated. Ten-fold leave-one-cohort-out internal validation further validated the optimal missing data approach.
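The validation measures named above can be illustrated with a minimal sketch. It assumes CIL is taken as the mean predicted risk minus the observed event rate, in percentage points (so negative values mean under-prediction), and computes the AUC by direct pairwise comparison. The function names and the toy data are hypothetical and are not taken from the paper.

```python
def calibration_in_the_large(y_true, y_prob):
    """Mean predicted risk minus observed event rate, in percentage points.

    Negative values indicate under-prediction (assumed sign convention).
    """
    n = len(y_true)
    return 100.0 * (sum(y_prob) / n - sum(y_true) / n)


def auc(y_true, y_prob):
    """Probability that a random case outranks a random non-case; ties count 1/2."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def leave_one_cohort_out(cohort_labels):
    """Yield (held_out, train_idx, test_idx), one fold per distinct cohort."""
    for held_out in sorted(set(cohort_labels)):
        train_idx = [i for i, c in enumerate(cohort_labels) if c != held_out]
        test_idx = [i for i, c in enumerate(cohort_labels) if c == held_out]
        yield held_out, train_idx, test_idx


# Toy data, invented for illustration only.
y_true = [1, 0, 1, 0, 0, 1]
y_prob = [0.60, 0.20, 0.40, 0.30, 0.50, 0.70]
cohorts = ["A", "A", "B", "B", "C", "C"]

cil = calibration_in_the_large(y_true, y_prob)
area = auc(y_true, y_prob)
folds = list(leave_one_cohort_out(cohorts))
```

With ten training cohorts, `leave_one_cohort_out` would yield ten folds, each holding out one complete cohort for testing.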

RESULTS

Among 12,703 biopsies from 10 training cohorts, 3,597 (28%) had clinically significant prostate cancer, compared to 1,757 of 5,540 (32%) in the external validation cohort. In external validation, the available cases method, which pooled individual patient data containing all risk factors input by an end-user, had the best CIL, under-predicting risk by 2.9 percentage points on average, and obtained an AUC of 75.7%. Imputation had the worst CIL (-13.3%). The available cases method was further validated as optimal in internal cross-validation and thus used for development of an online risk tool. For end-users of the risk tool, two risk factors were mandatory: serum prostate-specific antigen (PSA) and age, and ten were optional: digital rectal exam, prostate volume, prior negative biopsy, 5-alpha-reductase-inhibitor use, prior PSA screen, African ancestry, Hispanic ethnicity, first-degree prostate-, breast-, and second-degree prostate-cancer family history.
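The record-selection step of an available cases approach can be sketched as follows: given the risk factors an end-user supplies, keep only the pooled training records in which all of those factors were observed. This is an illustration under stated assumptions, not the authors' implementation. The field names (`psa`, `age`, `dre`, `volume`), the toy records, and the helper `available_cases` are hypothetical, and the subsequent logistic regression fit is omitted.

```python
# The abstract names PSA and age as the two mandatory risk factors.
MANDATORY = ("psa", "age")


def available_cases(records, user_factors):
    """Return the training records in which every user-supplied factor is observed."""
    missing = [f for f in MANDATORY if f not in user_factors]
    if missing:
        raise ValueError(f"mandatory risk factors missing: {missing}")
    return [
        r for r in records
        if all(r.get(f) is not None for f in user_factors)
    ]


# Hypothetical pooled records; None marks a factor that was not collected.
records = [
    {"cohort": "A", "psa": 4.1, "age": 62, "dre": 0, "volume": 45.0, "cancer": 0},
    {"cohort": "A", "psa": 9.8, "age": 70, "dre": 1, "volume": None, "cancer": 1},
    {"cohort": "B", "psa": 6.3, "age": 66, "dre": None, "volume": None, "cancer": 0},
    {"cohort": "B", "psa": 12.0, "age": 73, "dre": None, "volume": None, "cancer": 1},
]

psa_age_only = available_cases(records, ["psa", "age"])
with_dre = available_cases(records, ["psa", "age", "dre"])
with_volume = available_cases(records, ["psa", "age", "dre", "volume"])
```

In this toy example the usable training set shrinks as the user supplies more factors: cohort B drops out once `dre` is requested because it never collected that factor. A model restricted to the supplied factors would then be fitted on the selected records.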

CONCLUSION

Developers of clinical risk prediction tools should optimize use of available data and sources even in the presence of high amounts of missing data and offer options for users with missing risk factors.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b02c/9306143/7027bd4753d9/12874_2022_1674_Fig1_HTML.jpg
