A pancreatic cancer risk prediction model (Prism) developed and validated on large-scale US clinical data.

Affiliations

Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.

TriNetX, LLC, Cambridge, MA, 02140, USA.

Publication information

EBioMedicine. 2023 Dec;98:104888. doi: 10.1016/j.ebiom.2023.104888. Epub 2023 Nov 25.

Abstract

BACKGROUND

Pancreatic Duct Adenocarcinoma (PDAC) screening can enable early-stage disease detection and long-term survival. Current guidelines use inherited predisposition, with about 10% of PDAC cases eligible for screening. Using Electronic Health Record (EHR) data from a multi-institutional federated network, we developed and validated a PDAC RISk Model (Prism) for the general US population to extend early PDAC detection.

METHODS

Neural Network (PrismNN) and Logistic Regression (PrismLR) were developed using EHR data from 55 US Health Care Organisations (HCOs) to predict PDAC risk 6-18 months before diagnosis for patients 40 years or older. Model performance was assessed using Area Under the Curve (AUC) and calibration plots. Models were internal-externally validated by geographic location, race, and time. Simulated model deployment evaluated Standardised Incidence Ratio (SIR) and other metrics.
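The two headline metrics named above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names and the toy scores are assumptions made for illustration, AUC is computed here as the probability that a randomly chosen case scores higher than a randomly chosen control, and SIR is taken in its usual form of observed cases divided by expected cases.

```python
def auc(case_scores, control_scores):
    """Rank-based AUC: fraction of (case, control) pairs in which the
    case receives the higher risk score; ties count as half a win."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def standardised_incidence_ratio(observed_cases, expected_cases):
    """SIR: cases observed in the flagged group divided by the number
    expected if that group had the reference population's incidence."""
    return observed_cases / expected_cases


# Toy risk scores (hypothetical, not from the study).
toy_cases = [0.9, 0.7, 0.4]
toy_controls = [0.8, 0.3, 0.2, 0.1]
toy_auc = auc(toy_cases, toy_controls)  # 10 of 12 pairs ranked correctly
toy_sir = standardised_incidence_ratio(51, 10)
```

The pairwise loop is quadratic and only suitable for small examples; at the scale of the study a sort-based or library implementation would be the practical choice.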

FINDINGS

With 35,387 PDAC cases, 1,500,081 controls, and 87 features per patient, PrismNN obtained a test AUC of 0.826 (95% CI: 0.824-0.828) (PrismLR: 0.800 (95% CI: 0.798-0.802)). PrismNN's average internal-external validation AUCs were 0.740 for locations, 0.828 for races, and 0.789 (95% CI: 0.762-0.816) for time. At SIR = 5.10 (exceeding the current screening inclusion threshold) in simulated model deployment, PrismNN sensitivity was 35.9% (specificity 95.3%).

INTERPRETATION

Prism models demonstrated good accuracy and generalizability across diverse populations. PrismNN could find 3.5 times more cases at comparable risk than current screening guidelines. The small number of features provided a basis for model interpretation. Integration with the federated network provided data from a large, heterogeneous patient population and a pathway to future clinical deployment.
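One plausible reading of the "3.5 times" figure is a comparison between the simulated PrismNN sensitivity at SIR = 5.10 (35.9%) and the share of PDAC cases eligible under current guidelines (about 10%). The abstract does not spell out this calculation, so the arithmetic below is an assumption, and the variable names are illustrative.

```python
# Figures taken from the abstract; the 10% value is stated only as "about 10%".
prism_sensitivity = 0.359        # share of cases flagged by PrismNN at SIR = 5.10
guideline_eligible_share = 0.10  # approximate share of cases eligible today

# Ratio of cases reachable by PrismNN to cases reachable by current guidelines.
ratio = prism_sensitivity / guideline_eligible_share
print(f"{ratio:.2f}")
```

The result of about 3.6 is roughly in line with the reported 3.5, with the gap attributable to the approximate 10% figure.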

FUNDING

Prevent Cancer Foundation, TriNetX, Boeing, DARPA, NSF, and Aarno Labs.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2bd6/10755107/bb959b3544bc/gr1.jpg
