Medication based machine learning to identify subpopulations of pediatric hemodialysis patients in an electronic health record database.

Authors

McKnite Autumn M, Job Kathleen M, Nelson Raoul, Sherwin Catherine M T, Watt Kevin M, Brewer Simon C

Affiliations

Department of Pharmacology and Toxicology, College of Pharmacy, University of Utah, Salt Lake City, UT, USA.

Division of Clinical Pharmacology, Department of Pediatrics, The University of Utah, Salt Lake City, UT, USA.

Publication information

Inform Med Unlocked. 2022;34. doi: 10.1016/j.imu.2022.101104. Epub 2022 Oct 6.

Abstract

Electronic health records (EHRs) have given rise to large and complex databases of medical information that have the potential to become powerful tools for clinical research. However, coding systems, as well as the detail and accuracy of the information within EHRs, can vary across institutions. This makes it challenging to identify subpopulations of patients and limits the widespread use of multi-institutional databases. In this study, we leveraged machine learning to identify patterns in medication usage among hospitalized pediatric patients receiving renal replacement therapy and created a predictive model that successfully differentiated between intermittent hemodialysis (iHD) and continuous renal replacement therapy (CRRT) patients. We trained six machine learning algorithms (logistic regression, Naïve Bayes, k-nearest neighbor, support vector machine, random forest, and gradient boosted trees) using patient records from a multi-center database (n = 533) and prescribed medication ingredients (n = 228) as features to discriminate between the two hemodialysis types. Predictive skill was assessed using 5-fold cross-validation, and the algorithms showed a range of performance from 0.7 balanced accuracy (logistic regression) to 0.86 (random forest). The two best performing models were further tested using an independent single-center dataset and achieved 84-87% balanced accuracy. This model overcomes issues inherent within large databases and will allow us to utilize and combine historical records, significantly increasing population size and diversity within both iHD and CRRT populations for future clinical studies. Our work demonstrates the utility of using medications alone to accurately differentiate subpopulations of patients in large datasets, allowing codes to be transferred between different coding systems. This framework has the potential to be used to distinguish other subpopulations of patients where discriminatory ICD codes are not available, permitting more detailed insights and new lines of research.
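The evaluation described in the abstract (binary medication features, a classifier, 5-fold cross-validation, balanced accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the class labels and medication columns are invented for illustration, and every function name below is hypothetical. Only one of the six algorithm families (Naïve Bayes, here in a simple Bernoulli form) is shown, written with the Python standard library so the example is self-contained.

```python
import math
import random

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; for two classes, the mean of sensitivity and specificity."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, label in enumerate(y_true) if label == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def fit_bernoulli_nb(X, y):
    """Fit Bernoulli Naive Bayes with Laplace smoothing on 0/1 features."""
    model = {}
    for cls in set(y):
        rows = [x for x, label in zip(X, y) if label == cls]
        n = len(rows)
        # Smoothed probability that each medication is present in this class.
        probs = [(sum(col) + 1) / (n + 2) for col in zip(*rows)]
        model[cls] = (math.log(n / len(y)), probs)
    return model

def predict_bernoulli_nb(model, X):
    """Return the class with the highest log posterior for each row."""
    preds = []
    for x in X:
        scores = {}
        for cls, (log_prior, probs) in model.items():
            scores[cls] = log_prior + sum(
                math.log(p) if v else math.log(1 - p) for v, p in zip(x, probs)
            )
        preds.append(max(scores, key=scores.get))
    return preds

def stratified_folds(y, k, rng):
    """Split sample indices into k folds with similar class proportions."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(y)):
        idx = [i for i, label in enumerate(y) if label == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def cross_validate(X, y, k=5, seed=0):
    """Train on k-1 folds, score balanced accuracy on the held-out fold."""
    rng = random.Random(seed)
    scores = []
    for test_idx in stratified_folds(y, k, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit_bernoulli_nb([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict_bernoulli_nb(model, [X[i] for i in test_idx])
        scores.append(balanced_accuracy([y[i] for i in test_idx], preds))
    return scores

def make_synthetic(n_patients=200, n_meds=20, seed=1):
    """Synthetic stand-in data (assumption): label 1 plays the role of CRRT, 0 of iHD.
    The first 5 medication columns are made more common in class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_patients):
        label = int(rng.random() < 0.4)
        row = []
        for j in range(n_meds):
            p = 0.3
            if j < 5:
                p = 0.7 if label else 0.2
            row.append(int(rng.random() < p))
        X.append(row)
        y.append(label)
    return X, y

X, y = make_synthetic()
fold_scores = cross_validate(X, y, k=5)
mean_score = sum(fold_scores) / len(fold_scores)
print("balanced accuracy per fold:", [round(s, 2) for s in fold_scores])
print("mean balanced accuracy:", round(mean_score, 2))
```

Balanced accuracy averages the recall of each class, so it is not inflated when one dialysis type is much more common than the other, which is presumably why it suits an imbalanced two-class problem like this one. The scores printed here reflect only the synthetic data and say nothing about the study's results.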

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6162/9674326/a9ed647a9035/nihms-1849177-f0001.jpg
