Explainable machine learning identifies a polygenic risk score as a key predictor of pancreatic cancer risk in the UK Biobank.

Authors

Peduzzi Giulia, Felici Alessio, Pellungrini Roberto, Campa Daniele

Affiliations

Department of Biology, University of Pisa, Via Luca Ghini, 13 - 56126, Pisa, Italy.

Classe di scienze, Scuola Normale Superiore, Piazza dei Cavalieri, 7 - 56126, Pisa, Italy.

Publication information

Dig Liver Dis. 2025 Apr;57(4):915-922. doi: 10.1016/j.dld.2024.11.010. Epub 2024 Dec 3.

DOI: 10.1016/j.dld.2024.11.010
PMID: 39632152

Abstract

BACKGROUND

Predicting the risk of developing pancreatic ductal adenocarcinoma (PDAC) is of paramount importance, given its high mortality rate. Current PDAC risk prediction models rely on a limited number of variables, do not include genetics, and have modest accuracy.

AIM

This study aimed to develop an interpretable PDAC risk prediction model, based on machine learning (ML).

METHODS

Five ML models (Adaptive Boosting, eXtreme Gradient Boosting, CatBoost, Deep Forest and Random Forest) built on 56 exposome variables and a polygenic risk score (PRS) were tested in 654 PDAC cases and 1,308 controls of the UK Biobank. Additionally, SHapley Additive exPlanation (SHAP) and Global model Interpretation via the Recursive Partitioning (Girp) were employed to explain the models.
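
To make the idea behind SHAP concrete, the following is a minimal sketch of exact Shapley value attribution for a single prediction, computed by enumerating feature coalitions. Everything in it is an assumption for illustration: `toy_risk`, its coefficients, the feature values, and the baseline are hypothetical and are not the study's models or data. Library implementations of SHAP typically average over a background dataset and use faster, model-specific algorithms, whereas this sketch replaces absent features with a single baseline value.

```python
from itertools import combinations
from math import factorial


def toy_risk(features):
    """Hypothetical risk score (NOT the study's model), used only for illustration."""
    score = 0.02 * features["age"] + 0.5 * features["prs"]
    # Interaction term, so the attribution is not trivially linear.
    if features["pancreatitis"] and features["prs"] > 0:
        score += 0.3
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `instance` relative to a single `baseline` point.

    Features outside a coalition take their baseline value. The cost grows
    exponentially with the number of features, so this only suits tiny examples.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = set(subset)
                total += weight * (value(without | {name}) - value(without))
        phi[name] = total
    return phi


# Hypothetical individual and reference point.
instance = {"age": 70, "prs": 1.2, "pancreatitis": 1}
baseline = {"age": 55, "prs": 0.0, "pancreatitis": 0}

phi = shapley_values(toy_risk, instance, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the prediction for the individual and the prediction for the baseline, which is the additivity property that gives SHAP its name. In this toy model the interaction term is split equally between `prs` and `pancreatitis`.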

RESULTS

All models performed similarly, but CatBoost achieved the highest recall (77.10%). SHAP highlighted age and the PRS as the primary contributors across all models. Girp derived rules to distinguish cases from controls, and age, the PRS, and pancreatitis appeared in most of them.
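
Recall, the metric used above to rank the models, is the fraction of true cases that a model flags as cases: TP / (TP + FN). The sketch below computes it from predicted and true labels. The label vectors are made up for illustration and do not come from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FN, FP, TN for binary labels (1 = case, 0 = control)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def recall(tp, fn):
    """Fraction of true cases that were predicted as cases."""
    if tp + fn == 0:
        raise ValueError("recall is undefined when there are no true cases")
    return tp / (tp + fn)


# Hypothetical labels: 4 cases and 6 controls.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
recall_value = recall(tp, fn)
print(f"TP={tp} FN={fn} FP={fp} TN={tn} recall={recall_value:.2%}")
```

Ranking by recall favours models that miss few true cases, which is a common priority in screening settings, but recall alone says nothing about how many controls are wrongly flagged.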

CONCLUSION

The predictive models tested have exhibited good performance, indicating their potential application in the clinical field in the near future, with the PRS playing a key role in identifying high-risk individuals as demonstrated by the explainers.
