Quantifying predictive capability of electronic health records for the most harmful breast cancer.

Authors

Wu Yirong, Fan Jun, Peissig Peggy, Berg Richard, Tafti Ahmad Pahlavan, Yin Jie, Yuan Ming, Page David, Cox Jennifer, Burnside Elizabeth S

Affiliations

University of Wisconsin Madison, WI, USA.

Marshfield Clinic, Marshfield, WI, USA.

Publication information

Proc SPIE Int Soc Opt Eng. 2018 Feb;10577. doi: 10.1117/12.2293954. Epub 2018 Mar 7.

DOI:10.1117/12.2293954
PMID:29706685
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5914175/
Abstract

Improved prediction of the "most harmful" breast cancers that cause the most substantive morbidity and mortality would enable physicians to target more intense screening and preventive measures at those women who have the highest risk; however, such prediction models for the "most harmful" breast cancers have rarely been developed. Electronic health records (EHRs) represent an underused data source that has great research and clinical potential. Our goal was to quantify the value of EHR variables in the "most harmful" breast cancer risk prediction. We identified 794 subjects who had breast cancer with primary non-benign tumors with their earliest diagnosis on or after 1/1/2004 from an existing personalized medicine data repository, including 395 "most harmful" breast cancer cases and 399 "least harmful" breast cancer cases. For these subjects, we collected EHR data comprising six components: demographics, diagnoses, symptoms, procedures, medications, and laboratory results. We developed two regularized prediction models, Ridge Logistic Regression (Ridge-LR) and Lasso Logistic Regression (Lasso-LR), to predict the "most harmful" breast cancer one year in advance. The area under the ROC curve (AUC) was used to assess model performance. We observed that the AUCs of the Ridge-LR and Lasso-LR models were 0.818 and 0.839, respectively. For both the Ridge-LR and Lasso-LR models, the predictive performance of the full set of EHR variables was significantly higher than that of each individual component (p<0.001). In conclusion, EHR variables can be used to predict the "most harmful" breast cancer, providing the possibility to personalize care for those women at the highest risk in clinical practice.
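As a rough illustration of the two model families named in the abstract, the sketch below fits ridge (L2-penalized) and lasso (L1-penalized) logistic regression and scores each with a rank-based AUC. It is not the authors' implementation: the data are synthetic, and the helper names (`fit_logistic`, `make_data`, `auc`) and the settings `lam`, `lr`, and `epochs` are assumptions chosen for the example, so the printed AUCs bear no relation to the reported 0.818 and 0.839.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_logistic(X, y, penalty, lam, lr=0.1, epochs=300):
    # Penalized logistic regression by (proximal) gradient descent.
    # penalty="l2" is ridge, penalty="l1" is lasso; the intercept is unpenalized.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        if penalty == "l2":
            w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
        else:
            w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

def predict(X, w, b):
    # Predicted probability of the positive class for each row of X.
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def auc(y_true, scores):
    # Probability that a random positive outscores a random negative (ties count 0.5).
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def make_data(n, d, rng):
    # Synthetic stand-in for EHR features: only the first two columns carry signal.
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(d)]
        p = sigmoid(1.5 * x[0] - 1.0 * x[1])
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(400, 10, rng)
X_test, y_test = make_data(200, 10, rng)

results = {}
for name, penalty in [("Ridge-LR", "l2"), ("Lasso-LR", "l1")]:
    w, b = fit_logistic(X_train, y_train, penalty, lam=0.05)
    results[name] = auc(y_test, predict(X_test, w, b))
    print(name, "held-out AUC:", round(results[name], 3))
```

The practical difference between the two penalties is that the L1 update can set coefficients exactly to zero, which tends to yield a sparser model, while the L2 update only shrinks them. In a real analysis the penalty strength would normally be chosen by cross-validation rather than fixed as it is here.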

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f0d/5914175/69b2e0bcef8a/nihms957895f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f0d/5914175/ac87071917cf/nihms957895f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f0d/5914175/3cc7ee691eb2/nihms957895f2.jpg
