Discovering patient groups in sequential electronic healthcare data using unsupervised representation learning.

Author information

Li Jingteng, Zakka Kimberley R, Booth John, Rigny Louise, Ray Samiran, Cortina-Borja Mario, Barnaghi Payam, Sebire Neil

Affiliations

Great Ormond Street Institute of Child Health, University College London, London, UK.

Data Research Innovation and Virtual Environment, Great Ormond Street Hospital for Children, London, UK.

Publication information

BMC Med Inform Decis Mak. 2025 Jan 28;25(1):45. doi: 10.1186/s12911-024-02812-9.

Abstract

INTRODUCTION

Unsupervised feature learning methods inspired by natural language processing (NLP) models are capable of constructing patient-specific features from longitudinal Electronic Health Records (EHR).

DESIGN

We applied document embedding algorithms to real-world paediatric intensive care (PICU) EHR data to extract patient-specific features from 1853 patients' PICU journeys using 647 unique lab tests and medication events. We evaluated the clinical utility of the patient features via a K-means clustering analysis.
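The "patient journey as a document" framing can be illustrated with a minimal sketch. The abstract does not name the embedding algorithm, so this example deliberately substitutes plain TF-IDF weighting (a non-learned bag-of-words scheme) for a learned document embedding. The patient names, event tokens, and the `tfidf_vectors` helper are hypothetical and are not taken from the study.

```python
import math
from collections import Counter

# Hypothetical toy journeys: each patient is an ordered list of event tokens
# (lab tests and medication events), analogous to words in a document.
journeys = {
    "patient_a": ["lab:lactate", "med:morphine", "lab:lactate", "lab:sodium"],
    "patient_b": ["med:morphine", "med:midazolam", "lab:sodium"],
    "patient_c": ["lab:lactate", "lab:lactate", "lab:glucose"],
}

def tfidf_vectors(docs):
    """Map each token list to a TF-IDF vector over a shared, sorted vocabulary.

    This is a stand-in for a learned document embedding, not the study's method.
    """
    vocab = sorted({tok for doc in docs.values() for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of journeys that contain each token at least once.
    df = Counter(tok for doc in docs.values() for tok in set(doc))
    idf = {tok: math.log(n_docs / df[tok]) + 1.0 for tok in vocab}
    vectors = {}
    for name, doc in docs.items():
        counts = Counter(doc)
        # Term frequency (count / journey length) scaled by inverse document frequency.
        vectors[name] = [counts[tok] / len(doc) * idf[tok] for tok in vocab]
    return vocab, vectors

vocab, vectors = tfidf_vectors(journeys)
for name, vec in vectors.items():
    print(name, [round(x, 3) for x in vec])
```

Each patient ends up with one fixed-length vector regardless of journey length, which is the property that lets standard clustering algorithms be applied downstream. A learned embedding would additionally place journeys with related events close together, which TF-IDF does not attempt.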

RESULTS

We trained a document embedding model under a unique evaluation pipeline and obtained latent patient feature vectors for all 1853 patients. As a downstream analysis, we applied unsupervised clustering to the patient vectors and obtained 5 distinct clusters via hyperparameter optimisation. Significant variation (p<0.0001) was detected in patient characteristics and in surgical intervention and diagnostic profiles.
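Choosing the number of clusters for K-means can be sketched as follows. The abstract does not state which criterion was optimised, so the mean silhouette score used here is an assumption, as are the deterministic farthest-point initialisation and the toy 2-D points standing in for patient vectors. All function names are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def init_centroids(points, k):
    """Deterministic farthest-point initialisation (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iters=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    centroids = init_centroids(points, k)
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy stand-in for patient vectors: three planted groups of four points each.
points = [[0, 0], [0, 1], [1, 0], [1, 1],
          [10, 0], [10, 1], [11, 0], [11, 1],
          [5, 10], [5, 11], [6, 10], [6, 11]]

# Treat k as the hyperparameter and keep the value with the highest silhouette.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On this toy data the silhouette criterion favours the three planted groups. With real patient vectors the curve is usually flatter, so the selected value is typically checked against clinical interpretability of the resulting clusters.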

CONCLUSION

The K-means clustering results demonstrated the clinical utility of the patient-specific features learned by the embedding algorithms. The latent patient features obtained via the embedding process enable direct application of other machine learning algorithms. Future work will focus on utilising the temporal information within EHR and extending EHR embedding algorithms to develop personalised patient journey predictions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/76d8/11776155/b8a7e9d6146b/12911_2024_2812_Fig1_HTML.jpg