A Bayesian latent class approach for EHR-based phenotyping.

Author affiliations

Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania, Philadelphia, Pennsylvania.

Department of Pediatrics, University of Pennsylvania, Philadelphia, Pennsylvania.

Publication information

Stat Med. 2019 Jan 15;38(1):74-87. doi: 10.1002/sim.7953. Epub 2018 Sep 3.

Abstract

Phenotyping, i.e., identification of patients possessing a characteristic of interest, is a fundamental task for research conducted using electronic health records (EHRs). However, challenges to this task include imperfect sensitivity and specificity of clinical codes and inconsistent availability of more detailed data such as laboratory test results. Despite these challenges, most existing EHR-derived phenotypes are rule-based, consisting of a series of Boolean arguments informed by expert knowledge of the disease of interest and its coding. The objective of this paper is to introduce a Bayesian latent phenotyping approach that accounts for imperfect data elements and missing-not-at-random (MNAR) missingness patterns and that can be used when no gold-standard data are available. We conducted simulation studies to compare alternative phenotyping methods under different patterns of missingness and applied these approaches to a PEDSnet cohort of 68,265 children at elevated risk for type 2 diabetes mellitus (T2DM). In simulation studies, the latent class approach had similar sensitivity to a rule-based approach (95.9% vs 91.9%) while substantially improving specificity (99.7% vs 90.8%). In the PEDSnet cohort, we found that biomarkers and clinical codes were strongly associated with latent T2DM status. The latent T2DM class was also strongly predictive of missingness in biomarkers. Glucose was missing in 83.4% of patients (odds ratio for latent T2DM status = 0.52), while hemoglobin A1c was missing in 91.2% (odds ratio for latent T2DM status = 0.03), suggesting that these biomarkers are missing not at random. The latent phenotype approach may substantially improve on rule-based phenotyping.
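To illustrate why missingness itself can carry information about the latent phenotype, the following is a minimal sketch, not the authors' model. It assumes a single binary diagnosis code and a single laboratory test whose availability depends on the latent class, with the two treated as conditionally independent given that class. All parameter values (`prior`, `sens`, `spec`, `p_obs_case`, `p_obs_control`) and the function name `posterior_case` are hypothetical and chosen only for illustration; in a full Bayesian analysis they would be estimated from the data rather than fixed.

```python
def posterior_case(code_present, lab_observed,
                   prior=0.02,          # hypothetical prevalence of the phenotype
                   sens=0.90,           # hypothetical P(code present | case)
                   spec=0.95,           # hypothetical P(code absent | non-case)
                   p_obs_case=0.80,     # hypothetical P(lab observed | case)
                   p_obs_control=0.08): # hypothetical P(lab observed | non-case)
    """Posterior probability of being a case given one code and a lab-missingness flag.

    Assumes the code and the missingness indicator are conditionally
    independent given the latent class (an illustrative simplification).
    """
    # Likelihood contribution of the diagnosis code under each latent class.
    code_case = sens if code_present else 1.0 - sens
    code_control = (1.0 - spec) if code_present else spec

    # Likelihood contribution of the missingness indicator under each class.
    obs_case = p_obs_case if lab_observed else 1.0 - p_obs_case
    obs_control = p_obs_control if lab_observed else 1.0 - p_obs_control

    # Bayes' rule over the two latent classes.
    num = prior * code_case * obs_case
    den = num + (1.0 - prior) * code_control * obs_control
    return num / den


if __name__ == "__main__":
    for code in (True, False):
        for lab in (True, False):
            p = posterior_case(code, lab)
            print(f"code_present={code!s:5} lab_observed={lab!s:5} posterior={p:.3f}")
```

Under these assumed values, a patient with a diagnosis code whose laboratory test was never ordered receives a much lower posterior probability than a patient with the same code and an observed test. A rule based on the code alone would treat the two patients identically, which is one intuition for how modeling the missingness mechanism can improve specificity.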

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/11df/6519239/cece4c006119/SIM-38-74-g001.jpg
