Comparative analysis, applications, and interpretation of electronic health record-based stroke phenotyping methods.

Authors

Thangaraj Phyllis M, Kummer Benjamin R, Lorberbaum Tal, Elkind Mitchell S V, Tatonetti Nicholas P

Affiliations

Department of Biomedical Informatics, Columbia University, 622 W 168th St., PH-20, New York, NY, 10032, USA.

Department of Systems Biology, Columbia University, New York, NY, USA.

Publication information

BioData Min. 2020 Dec 7;13(1):21. doi: 10.1186/s13040-020-00230-x.

DOI:10.1186/s13040-020-00230-x
PMID:33372632
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7720570/
Abstract

BACKGROUND

Accurate identification of acute ischemic stroke (AIS) patient cohorts is essential for a wide range of clinical investigations. Automated phenotyping methods that leverage electronic health records (EHRs) represent a fundamentally new approach to cohort identification, one that avoids the current laborious and ungeneralizable generation of phenotyping algorithms. We systematically compared and evaluated the ability of machine learning algorithms and case-control combinations to phenotype acute ischemic stroke patients using data from an EHR.

MATERIALS AND METHODS

Using structured patient data from the EHR at a tertiary-care hospital system, we built and evaluated machine learning models to identify patients with AIS based on 75 different case-control and classifier combinations. We then estimated the prevalence of AIS patients across the EHR. Finally, we externally validated the ability of the models to detect AIS patients without AIS diagnosis codes using the UK Biobank.
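The abstract reports 75 case-control and classifier combinations but does not say how that number breaks down. As a minimal sketch of how such an evaluation grid might be enumerated, the snippet below crosses case definitions, control definitions, and classifier names. Every label in it is a hypothetical placeholder (only AIS-code cases and no-cerebrovascular-code controls are named in the abstract), and the 3 x 5 x 5 split is an assumption chosen only because it multiplies to 75.

```python
from itertools import product

# All names below are illustrative placeholders, not the study's definitions.
CASE_DEFINITIONS = ["cases_ais_codes", "cases_def_b", "cases_def_c"]
CONTROL_DEFINITIONS = [
    "controls_no_cerebrovascular_codes",
    "controls_def_b",
    "controls_def_c",
    "controls_def_d",
    "controls_def_e",
]
CLASSIFIERS = ["clf_a", "clf_b", "clf_c", "clf_d", "clf_e"]


def build_grid(cases, controls, classifiers):
    """Return every (case, control, classifier) triple to train and evaluate."""
    return list(product(cases, controls, classifiers))


grid = build_grid(CASE_DEFINITIONS, CONTROL_DEFINITIONS, CLASSIFIERS)
print(len(grid))  # 75
```

Each triple would then be trained and scored independently, which is what makes per-combination comparisons of the kind reported in the results possible.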

RESULTS

Across all models, the mean AUROC for detecting AIS was 0.963 ± 0.0520 and the average precision score was 0.790 ± 0.196, with minimal feature processing. Classifiers trained on cases with AIS diagnosis codes and controls with no cerebrovascular disease codes had the best average F1 score (0.832 ± 0.0383). In the external validation, we found that the top probabilities from a model-predicted AIS cohort were significantly enriched for AIS patients without AIS diagnosis codes (60-150 fold over expected).
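For readers less familiar with the metrics quoted above, the following is a small dependency-free sketch of how AUROC, average precision, F1, and a fold-enrichment ratio can be computed from labels and model scores. The toy data, the 0.5 decision threshold, and the definition of "expected" as the baseline prevalence are all assumptions made for illustration; the abstract does not state how the authors defined the expected rate.

```python
def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of precision@k over each rank k at which a positive appears."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / k
    return total / hits


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn)


def fold_enrichment(y_true, scores, top_k, expected_rate):
    """Observed positive rate among the top_k scores divided by an expected rate."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    observed_rate = sum(label for _, label in ranked[:top_k]) / top_k
    return observed_rate / expected_rate


# Toy data, invented for this example only.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(round(auroc(y_true, scores), 3))              # 0.889
print(round(average_precision(y_true, scores), 3))  # 0.917
print(round(f1_score(y_true, y_pred), 3))           # 0.857
print(fold_enrichment(y_true, scores, top_k=2, expected_rate=0.5))  # 2.0
```

A fold enrichment of 60 to 150, as reported for the external validation, would mean that true AIS patients appear among the top-ranked predictions 60 to 150 times more often than the expected rate alone would predict.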

CONCLUSIONS

Our findings support machine learning algorithms as a generalizable way to accurately identify AIS patients without using process-intensive manual feature curation. When a set of AIS patients is unavailable, diagnosis codes may be used to train classifier models.
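The idea of using diagnosis codes as training labels can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: it uses ICD-10 prefixes (I63 for cerebral infarction, I60-I69 for cerebrovascular diseases) as stand-in code sets, and the helper names `assign_label` and `build_training_set` are hypothetical. Dropping the label-defining codes from the feature set is a design choice assumed here so that a classifier cannot simply relearn the labelling rule.

```python
# Assumed, illustrative ICD-10 code sets; the study's own code lists may differ.
AIS_PREFIXES = ("I63",)                                         # cerebral infarction
CEREBROVASCULAR_PREFIXES = tuple(f"I6{d}" for d in range(10))   # I60..I69


def assign_label(codes):
    """Return 1 (case), 0 (control), or None (excluded) from a patient's codes."""
    if any(code.startswith(AIS_PREFIXES) for code in codes):
        return 1
    if any(code.startswith(CEREBROVASCULAR_PREFIXES) for code in codes):
        return None  # other cerebrovascular disease: neither case nor control
    return 0


def build_training_set(patients):
    """Turn {patient_id: set_of_codes} into (patient_id, features, label) rows."""
    rows = []
    for patient_id, codes in patients.items():
        label = assign_label(codes)
        if label is None:
            continue
        # Remove label-defining codes so they cannot leak into the features.
        features = sorted(
            c for c in codes if not c.startswith(CEREBROVASCULAR_PREFIXES)
        )
        rows.append((patient_id, features, label))
    return rows


# Toy patients, invented for this example only.
patients = {
    "p1": {"I63.9", "I10"},   # AIS code present -> case
    "p2": {"I61.0"},          # other cerebrovascular code -> excluded
    "p3": {"E11.9"},          # no cerebrovascular codes -> control
}
rows = build_training_set(patients)
print(rows)
```

The resulting rows could be fed to any classifier; the point of the sketch is only that cases and controls are derived from codes alone, without manual chart review.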


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/949c/7720570/28d55c89afa0/13040_2020_230_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/949c/7720570/f83572fafc0b/13040_2020_230_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/949c/7720570/7927b683ee37/13040_2020_230_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/949c/7720570/9a39a69156bb/13040_2020_230_Fig4_HTML.jpg
