

Communicating exploratory unsupervised machine learning analysis in age clustering for paediatric disease.

Affiliations

DRIVE, Great Ormond Street Hospital for Children, London, UK.

NIHR GOSH BRC, London, UK.

Publication information

BMJ Health Care Inform. 2024 Jul 29;31(1):e100963. doi: 10.1136/bmjhci-2023-100963.

DOI:10.1136/bmjhci-2023-100963
PMID:39074912
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11288139/
Abstract

BACKGROUND

Despite the increasing availability of electronic healthcare record (EHR) data and the wide availability of plug-and-play machine learning (ML) application programming interfaces, the adoption of data-driven decision-making within routine hospital workflows has thus far remained limited. Through the lens of deriving clusters of diagnoses by age, this study investigated the type of ML analysis that can be performed using EHR data and how results could be communicated to lay stakeholders.

METHODS

After preprocessing, observational EHR data from a tertiary paediatric hospital, comprising 61 522 unique patients and 3315 unique ICD-10 diagnosis codes, were used. K-means clustering was applied to identify age distributions of patient diagnoses. The final model was selected using quantitative metrics and expert assessment of the clinical validity of the clusters. Additionally, uncertainty over preprocessing decisions was analysed.
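As an illustration of the clustering step, the following is a minimal sketch and not the authors' pipeline. It assumes each diagnosis is represented by the proportion of its patients falling into each of four age bins, and it uses a deterministic farthest-point initialisation so the toy run is reproducible. The diagnosis names, counts, bin edges, and all function names are hypothetical, and the within-cluster sum of squares stands in for the unspecified "quantitative metrics".

```python
# Toy k-means over per-diagnosis age profiles (standard library only).
# All data below are invented for illustration; nothing comes from the study.

AGE_BINS = ["0-1", "1-5", "5-13", "13-18"]  # assumed bin edges

def normalise(counts):
    """Turn raw per-bin patient counts into proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Start from the first point, then repeatedly add the point that is
    farthest from every centroid chosen so far (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(nxt)
    return [list(c) for c in centroids]

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update
    until the labels stop changing."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances (lower means tighter clusters)."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))

# Hypothetical diagnoses with patient counts per age bin.
raw_counts = {
    "dx_infant_1": [90, 5, 3, 2],   "dx_infant_2": [80, 10, 5, 5],
    "dx_toddler_1": [5, 85, 7, 3],  "dx_toddler_2": [10, 75, 10, 5],
    "dx_child_1": [2, 8, 80, 10],   "dx_child_2": [5, 10, 75, 10],
    "dx_teen_1": [1, 4, 10, 85],    "dx_teen_2": [3, 7, 15, 75],
}
names = list(raw_counts)
points = [normalise(raw_counts[n]) for n in names]

labels, centroids = kmeans(points, k=4)
for name, lab in zip(names, labels):
    peak = AGE_BINS[max(range(len(AGE_BINS)), key=lambda i: centroids[lab][i])]
    print(f"{name}: cluster {lab}, peak age bin {peak}")

# Compare candidate values of k by inertia, one possible quantitative metric.
for k in range(2, 6):
    lab_k, cen_k = kmeans(points, k)
    print(k, round(inertia(points, lab_k, cen_k), 4))
```

In practice a quantitative score of this kind would be read alongside expert review of the clusters, as the abstract describes, rather than used alone to fix the number of clusters.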

FINDINGS

Four age clusters of diseases were identified, broadly aligning to the following age ranges in years: 0 to 1; 1 to 5; 5 to 13; and 13 to 18. Diagnoses within the clusters aligned with existing knowledge regarding the propensity of presentation at different ages, and sequential clusters reflected known disease progressions. The results validated similar methodologies within the literature. The impact of uncertainty induced by preprocessing decisions was large at the level of individual diagnoses but not at a population level. Strategies for mitigating, or communicating, this uncertainty were successfully demonstrated.
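One way to make the contrast between diagnosis-level and population-level uncertainty concrete is to compare cluster assignments obtained under two preprocessing variants. The sketch below is an assumption about how such a comparison could be set up, not the method used in the study. The diagnosis identifiers, the two runs, and both function names are hypothetical, and "population level" is interpreted here as each age band's share of all diagnoses.

```python
from collections import Counter

def diagnosis_level_change(run_a, run_b):
    """Fraction of diagnoses assigned to a different age band in the two runs."""
    changed = sum(1 for code in run_a if run_a[code] != run_b[code])
    return changed / len(run_a)

def population_level_shift(run_a, run_b):
    """Largest absolute change in any age band's share of diagnoses."""
    count_a, count_b = Counter(run_a.values()), Counter(run_b.values())
    n = len(run_a)
    bands = set(count_a) | set(count_b)
    return max(abs(count_a[band] - count_b[band]) / n for band in bands)

# Hypothetical assignments under two preprocessing variants.
run_a = {"dx1": "0-1", "dx2": "0-1", "dx3": "1-5", "dx4": "1-5",
         "dx5": "5-13", "dx6": "5-13", "dx7": "13-18", "dx8": "13-18"}
run_b = dict(run_a, dx1="1-5", dx3="0-1")  # two diagnoses swap bands

print("diagnoses reassigned:", diagnosis_level_change(run_a, run_b))
print("largest band-share shift:", population_level_shift(run_a, run_b))
```

In this toy case a quarter of the diagnoses change band while the size of every band is unchanged, which is the kind of pattern that could be reported to stakeholders as "individual assignments are uncertain, the overall picture is stable".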

CONCLUSION

Unsupervised ML applied to EHR data identifies clinically relevant age distributions of diagnoses, which can augment existing decision-making. However, biases within healthcare datasets dramatically impact results if they are not appropriately mitigated or communicated.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d0a2/11288139/b33525de4598/bmjhci-31-1-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d0a2/11288139/ef8d2dead54b/bmjhci-31-1-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d0a2/11288139/1ea9d19087cc/bmjhci-31-1-g002.jpg


