Discovering patient groups in sequential electronic healthcare data using unsupervised representation learning.

Authors

Li Jingteng, Zakka Kimberley R, Booth John, Rigny Louise, Ray Samiran, Cortina-Borja Mario, Barnaghi Payam, Sebire Neil

Affiliations

Great Ormond Street Institute of Child Health, University College London, London, UK.

Data Research Innovation and Virtual Environment, Great Ormond Street Hospital for Children, London, UK.

Publication information

BMC Med Inform Decis Mak. 2025 Jan 28;25(1):45. doi: 10.1186/s12911-024-02812-9.

DOI:10.1186/s12911-024-02812-9
PMID:39875929
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11776155/
Abstract

INTRODUCTION

Unsupervised feature learning methods inspired by natural language processing (NLP) models are capable of constructing patient-specific features from longitudinal Electronic Health Records (EHR).

DESIGN

We applied document embedding algorithms to real-world paediatric intensive care (PICU) EHR data to extract patient-specific features from 1853 patients' PICU journeys using 647 unique lab tests and medication events. We evaluated the clinical utility of the patient features via a K-means clustering analysis.
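
To make the "patient journey as a document" idea concrete, the sketch below treats each journey as a list of event tokens and maps it to a fixed-length vector. The abstract does not name the embedding algorithm, so this is not the authors' method: TF-IDF weighting is swapped in as a simple stand-in that ignores event order, and the patient IDs, event tokens, and the helper names `tfidf_vectors` and `cosine` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: each patient journey is an ordered list of event tokens
# (lab tests and medication events). Real PICU data would be far larger.
journeys = {
    "patient_a": ["lab:lactate", "med:morphine", "lab:lactate", "med:adrenaline"],
    "patient_b": ["lab:lactate", "med:morphine", "med:morphine"],
    "patient_c": ["lab:crp", "med:amoxicillin", "lab:crp"],
}


def tfidf_vectors(docs):
    """Map each document (list of tokens) to a unit-length TF-IDF vector."""
    vocab = sorted({tok for toks in docs.values() for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of journeys that contain each token.
    df = Counter(tok for toks in docs.values() for tok in set(toks))
    # Smoothed inverse document frequency, so rare events weigh more.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = {}
    for pid, toks in docs.items():
        counts = Counter(toks)
        vec = [counts[tok] / len(toks) * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors[pid] = [v / norm for v in vec]
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity; inputs are already unit length, so a dot product suffices."""
    return sum(a * b for a, b in zip(u, v))


vocab, vectors = tfidf_vectors(journeys)
print(len(vocab), "distinct event tokens")
print("a vs b:", round(cosine(vectors["patient_a"], vectors["patient_b"]), 3))
print("a vs c:", round(cosine(vectors["patient_a"], vectors["patient_c"]), 3))
```

Patients who share events end up with similar vectors, which is the property a downstream clustering step relies on. A learned document embedding would additionally capture context between events, which this stand-in does not.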

RESULTS

We trained a document embedding model under a unique evaluation pipeline and obtained latent patient feature vectors for all 1853 patients. As a downstream analysis, we performed unsupervised clustering on the patient vectors and obtained 5 distinct clusters via hyperparameter optimisation. Significant variation (p<0.0001) was detected in patient characteristics and in surgical intervention and diagnostic profiles.
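
The step of choosing the number of clusters can be sketched as follows. The abstract does not say which criterion drove the hyperparameter optimisation, so the mean silhouette score used here is an assumption, as are the deterministic farthest-first initialisation and the toy 2-D points standing in for patient vectors. The function names are hypothetical.

```python
import math


def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def farthest_first(points, k):
    """Deterministic initialisation (an assumption): start at the first point,
    then repeatedly add the point farthest from its nearest chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; higher means tighter, better separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        scores.append((b - a) / (max(a, b) or 1.0))
    return sum(scores) / len(scores)


# Toy stand-in for patient vectors: three tight 3x3 grids of 2-D points.
centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
offsets = (-0.3, 0.0, 0.3)
points = [(cx + dx, cy + dy) for cx, cy in centres for dx in offsets for dy in offsets]

# Scan candidate cluster counts and keep the one with the best silhouette.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print("selected k:", best_k)
```

On this toy data the scan recovers the three planted groups. On real patient vectors the score curve is usually flatter, so the selected value should be checked against clinical interpretability.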

CONCLUSION

The K-means clustering results demonstrated the clinical utility of the patient-specific features learned by the embedding algorithms. The latent patient features obtained via the embedding process enabled direct application of other machine learning algorithms. Future work will focus on utilising the temporal information within EHR and extending EHR embedding algorithms to develop personalised patient journey predictions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/76d8/11776155/b8a7e9d6146b/12911_2024_2812_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/76d8/11776155/0ea10d9afa5d/12911_2024_2812_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/76d8/11776155/a55fe572af66/12911_2024_2812_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/76d8/11776155/c85b7fc46374/12911_2024_2812_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/76d8/11776155/06fd8c9b1f2a/12911_2024_2812_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/76d8/11776155/1094f19a3416/12911_2024_2812_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/76d8/11776155/8175691c4615/12911_2024_2812_Fig7_HTML.jpg
