Application of unsupervised deep learning algorithms for identification of specific clusters of chronic cough patients from EMR data.

Affiliations

Indiana University School of Medicine, 1101 W 10th Street, Indianapolis, IN, 46202, USA.

Purdue School of Engineering and Technology, IUPUI, ET 301L, 799 W. Michigan Street, Indianapolis, IN, 46202, USA.

Publication information

BMC Bioinformatics. 2022 Apr 19;23(Suppl 3):140. doi: 10.1186/s12859-022-04680-4.

Abstract

BACKGROUND

Chronic cough affects approximately 10% of adults. Because there are no ICD codes for chronic cough, supervised learning methods are difficult to apply to predict the characteristics of chronic cough patients, so these patients must be identified by other mechanisms. We developed a deep clustering algorithm with auto-encoder embedding (DCAE) to identify clusters of chronic cough patients in a large cohort of 264,146 patients from the Electronic Medical Records (EMR) system. We constructed features from the diagnoses recorded in the EMR, then built a clustering-oriented loss function directly on the embedded features of the deep autoencoder so that feature refinement and cluster assignment are performed jointly. Lastly, we performed statistical analysis on the identified clusters to characterize chronic cough patients relative to non-chronic cough patients.
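A clustering-oriented loss on embedded features can be illustrated with a minimal sketch. The abstract does not give the exact form of the DCAE loss, so the code below assumes a common choice from the deep clustering literature: Student's t soft assignments to cluster centroids, a sharpened target distribution, and a KL-divergence loss between the two. The function names, the toy 2-D "embeddings", and the centroid values are all hypothetical stand-ins for the output of a trained autoencoder.

```python
import numpy as np

def soft_assign(z, centroids, alpha=1.0):
    """Soft assignment q[i, j] of embedded point i to centroid j (Student's t kernel)."""
    # Squared Euclidean distances, shape (n_points, n_clusters)
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened target p[i, j] that puts more weight on confident assignments."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def clustering_loss(q, p, eps=1e-12):
    """KL(P || Q), averaged over points (assumed loss, not confirmed by the abstract)."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))) / q.shape[0])

# Toy stand-in for autoencoder embeddings: two well-separated blobs in 2-D.
rng = np.random.default_rng(0)
z = np.vstack([
    rng.normal(-2.0, 0.3, size=(50, 2)),
    rng.normal(2.0, 0.3, size=(50, 2)),
])
centroids = np.array([[-1.0, 0.0], [1.0, 0.0]])  # hypothetical initial centroids

q = soft_assign(z, centroids)
p = target_distribution(q)
loss = clustering_loss(q, p)
labels = q.argmax(axis=1)  # hard cluster assignment per patient embedding
```

In a full model of this kind, the gradient of the loss would flow back into both the centroids and the encoder weights, which is what allows feature refinement and cluster assignment to happen jointly rather than in two separate stages.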

RESULTS

The experimental results show that the DCAE model generated three chronic cough clusters and one non-chronic cough patient cluster. We found various diagnoses, medications, and lab tests highly associated with chronic cough patients by comparing the chronic cough cluster with the non-chronic cough cluster. Comparison of chronic cough clusters demonstrated that certain combinations of medications and diagnoses characterize some chronic cough clusters.
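As an illustration of how a diagnosis, medication, or lab test might be compared between a chronic cough cluster and the non-chronic cough cluster, the sketch below computes an odds ratio with a 95% confidence interval and a Pearson chi-square statistic from a 2x2 table. The abstract does not name the statistical tests that were used, so this choice is an assumption, and every count in the example is hypothetical rather than taken from the study.

```python
import math

def odds_ratio_2x2(a, b, c, d):
    """Odds ratio with a 95% CI (Woolf log method) for a 2x2 table.

    a: cluster A patients with the feature,  b: cluster A patients without it
    c: cluster B patients with the feature,  d: cluster B patients without it
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, lower, upper

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts: a diagnosis seen in 300 of 1000 patients in a chronic
# cough cluster versus 100 of 1000 patients in the non-chronic cough cluster.
a, b, c, d = 300, 700, 100, 900
odds_ratio, ci_lower, ci_upper = odds_ratio_2x2(a, b, c, d)
chi2 = chi_square_2x2(a, b, c, d)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({ci_lower:.2f}, {ci_upper:.2f}), chi2 = {chi2:.1f}")
```

Running a comparison like this across many features would normally also call for a multiple-testing correction, which is omitted here for brevity.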

CONCLUSIONS

To the best of our knowledge, this study is the first to test the potential of unsupervised deep learning methods for chronic cough investigation. The approach also shows a great advantage over existing algorithms for patient data clustering.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ddf/9019947/b308adfcf4ec/12859_2022_4680_Fig1_HTML.jpg
