Early Detection of Pulmonary Embolism in a General Patient Population Immediately Upon Hospital Admission Using Machine Learning to Identify New, Unidentified Risk Factors: Model Development Study.

Affiliations

Department of Industrial Engineering and Management, Ben-Gurion University of the Negev, Beer-Sheva, Israel.

Education Authority, Chaim Sheba Medical Center, Faculty of Health Science and Medicine, Tel-Aviv University, Tel-Aviv, Israel.

Publication information

J Med Internet Res. 2024 Jul 30;26:e48595. doi: 10.2196/48595.

DOI:10.2196/48595
PMID:39079116
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11322683/
Abstract

BACKGROUND

Under- or late identification of pulmonary embolism (PE), a thrombosis of 1 or more pulmonary arteries that seriously threatens patients' lives, is a major challenge confronting modern medicine.

OBJECTIVE

We aimed to establish accurate and informative machine learning (ML) models to identify patients at high risk for PE as they are admitted to the hospital, before their initial clinical checkup, by using only the information in their medical records.

METHODS

We collected demographic, comorbidity, and medication data for 2568 patients with PE and 52,598 control patients. We focused on data available prior to emergency department admission, as these are the most universally accessible data. We trained a random forest ML algorithm to detect PE at the earliest possible time during a patient's hospitalization, namely at the time of admission. We developed and applied 2 ML-based methods specifically to address the data imbalance between PE and non-PE patients, which causes misdiagnosis of PE.
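The abstract does not say which 2 imbalance-handling methods were developed, so the following is only a generic illustration of the problem they address, not the authors' approach. It is a minimal sketch of random undersampling of the majority class, a common baseline technique; the record layout, the `pe` label key, and the function name `undersample_majority` are all assumptions made for illustration, and the data are synthetic.

```python
import random


def undersample_majority(records, label_key="pe", seed=0):
    """Return a class-balanced subset by randomly dropping majority-class records.

    Hypothetical helper: a generic baseline, not the method used in the study.
    """
    rng = random.Random(seed)
    positives = [r for r in records if r[label_key] == 1]
    negatives = [r for r in records if r[label_key] == 0]
    # Identify which class is smaller; keep all of it.
    minority, majority = sorted((positives, negatives), key=len)
    # Sample the larger class down to the size of the smaller one.
    sampled = rng.sample(majority, k=len(minority))
    balanced = minority + sampled
    rng.shuffle(balanced)
    return balanced


# Synthetic data: 40 positive records among 1000 (about 4% prevalence).
records = [{"id": i, "pe": 1 if i < 40 else 0} for i in range(1000)]
balanced = undersample_majority(records)
print(len(balanced), sum(r["pe"] for r in balanced))
```

Undersampling discards most control records, which is one reason studies often prefer more elaborate strategies; the sketch is meant only to make the imbalance concrete.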

RESULTS

The resulting models predicted PE based on age, sex, BMI, past clinical PE events, chronic lung disease, past thrombotic events, and usage of anticoagulants, obtaining an 80% geometric mean value for the PE and non-PE classification accuracies. Although on hospital admission only 4% (1942/46,639) of the patients had a diagnosis of PE, we identified 2 clustering schemes comprising subgroups in which about 61% or more of the patients were positive for PE (705/1120 in clustering scheme 1; 427/701 and 340/549 in clustering scheme 2). One subgroup in the first clustering scheme included 36% (705/1942) of all patients with PE; these patients were characterized by a definite past PE diagnosis, a 6-fold higher prevalence of deep vein thrombosis, and a 3-fold higher prevalence of pneumonia, compared with patients of the other subgroups in this scheme. In the second clustering scheme, 2 subgroups (1 of only men and 1 of only women) included patients who all had a past PE diagnosis and a relatively high prevalence of pneumonia, and a third subgroup included only those patients with a past diagnosis of pneumonia.
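To make the reported metric concrete, here is a minimal sketch of the geometric mean of the two class-wise accuracies, assuming these correspond to sensitivity (accuracy on PE patients) and specificity (accuracy on non-PE patients). The function names and the all-negative classifier are illustrative assumptions; only the cohort sizes (1942 of 46,639) come from the abstract.

```python
import math


def class_accuracies(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def geometric_mean(tp, fn, tn, fp):
    """Geometric mean of the accuracy on positives and on negatives."""
    sensitivity, specificity = class_accuracies(tp, fn, tn, fp)
    return math.sqrt(sensitivity * specificity)


def accuracy(tp, fn, tn, fp):
    """Plain overall accuracy, shown for comparison."""
    return (tp + tn) / (tp + fn + tn + fp)


# Cohort sizes from the Results: 1942 PE patients among 46,639 admissions.
N_PE = 1942
N_NON_PE = 46639 - N_PE

# Hypothetical classifier that labels every patient as non-PE.
trivial = dict(tp=0, fn=N_PE, tn=N_NON_PE, fp=0)
print(f"accuracy = {accuracy(**trivial):.3f}")
print(f"geometric mean = {geometric_mean(**trivial):.3f}")
```

With roughly 4% prevalence, the all-negative classifier scores a high overall accuracy while its geometric mean is zero, which is why a metric of this kind is a natural choice for imbalanced data.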

CONCLUSIONS

This study established an ML tool for early diagnosis of PE almost immediately upon hospital admission. Despite a highly imbalanced data set, which undermines accurate PE prediction, and despite using only information available from the patient's medical history, our models were both accurate and informative, enabling the identification of patients already at high risk for PE upon hospital admission, even before the initial clinical checkup was performed. Because we did not restrict our patients to those at high risk for PE according to previously published scales (eg, Wells or revised Geneva scores), we were able to accurately assess the application of ML on raw medical data and identify new, previously unidentified risk factors for PE, such as previous pulmonary disease, in general populations.
