Impact of Different Electronic Cohort Definitions to Identify Patients With Atrial Fibrillation From the Electronic Medical Record.

Author affiliations

Division of Cardiovascular Medicine, Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT.

Division of Epidemiology, Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT.

Publication information

J Am Heart Assoc. 2020 Mar 3;9(5):e014527. doi: 10.1161/JAHA.119.014527. Epub 2020 Feb 26.

Abstract

Background: Electronic medical records (EMRs) allow identification of disease-specific patient populations, but varying electronic cohort definitions could result in different populations. We compared the characteristics of an EMR-derived atrial fibrillation (AF) patient population using 5 different electronic cohort definitions.

Methods and Results: Adult patients with at least 1 AF billing code from January 1, 2010, to December 31, 2017, were included. Based on different electronic cohort definitions, we trained 5 different logistic regression models using a labeled training data set (n=786). Each model yielded a predicted probability; patients were classified as having AF if the probability was higher than a specified cut point. Test characteristics were calculated for each model. These models were then applied to the full cohort, and the resulting characteristics were compared. In the training set, the comprehensive model (including demographics, billing codes, and natural language processing results) performed best, with an area under the curve of 0.89, sensitivity of 0.90, and specificity of 0.87. Among a candidate population (n=22 000), the proportion of patients identified as having AF varied from 61% in the model using diagnosis or procedure billing codes to 83% in the model using natural language processing of clinical notes. Among identified AF patients, the proportion of patients with a CHA2DS2-VASc score ≥2 varied from 69% to 85%; oral anticoagulant treatment rates varied from 50% to 66% depending on the model.

Conclusions: Different electronic cohort definitions result in substantially different AF study samples. This difference threatens the quality and reproducibility of EMR-based research and quality initiatives.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/799d/7335556/7c52bb36fc5d/JAH3-9-e014527-g001.jpg
