Leveraging error-prone algorithm-derived phenotypes: Enhancing association studies for risk factors in EHR data.

Affiliations

Center for Health AI and Synthesis of Evidence (CHASE), Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; The Graduate Group in Applied Mathematics and Computational Science, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA.

Center for Health AI and Synthesis of Evidence (CHASE), Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Publication information

J Biomed Inform. 2024 Sep;157:104690. doi: 10.1016/j.jbi.2024.104690. Epub 2024 Jul 14.

DOI:10.1016/j.jbi.2024.104690
PMID:39004110
Abstract

OBJECTIVES

It has become increasingly common for multiple computable phenotypes from electronic health records (EHR) to be developed for a given phenotype. However, EHR-based association studies often focus on a single phenotype. In this paper, we develop a method aiming to simultaneously make use of multiple EHR-derived phenotypes for reduction of bias due to phenotyping error and improved efficiency of phenotype/exposure associations.

MATERIALS AND METHODS

The proposed method combines multiple algorithm-derived phenotypes with a small set of validated outcomes to reduce bias and improve estimation accuracy and efficiency. The performance of our method was evaluated through simulation studies and real-world application to an analysis of colon cancer recurrence using EHR data from Kaiser Permanente Washington.

RESULTS

In settings where there was no single surrogate performing uniformly better than all others in terms of both sensitivity and specificity, our method achieved substantial bias reduction compared to using a single algorithm-derived phenotype. Our method also led to higher estimation efficiency by up to 30% compared to an estimator that used only one algorithm-derived phenotype.
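To make the notion of a surrogate that performs "uniformly better" concrete, the following Python sketch computes the sensitivity and specificity of each algorithm-derived phenotype against validated outcomes, then checks whether any one surrogate is at least as good as every other on both measures. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def sens_spec(truth: Sequence[int], surrogate: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity and specificity of a binary surrogate against validated truth.

    Assumes both classes (0 and 1) occur in `truth`; otherwise a division
    by zero would occur.
    """
    pairs = list(zip(truth, surrogate))
    tp = sum(1 for t, s in pairs if t == 1 and s == 1)
    fn = sum(1 for t, s in pairs if t == 1 and s == 0)
    tn = sum(1 for t, s in pairs if t == 0 and s == 0)
    fp = sum(1 for t, s in pairs if t == 0 and s == 1)
    return tp / (tp + fn), tn / (tn + fp)


def dominant_surrogates(truth: Sequence[int],
                        surrogates: Dict[str, Sequence[int]]) -> List[str]:
    """Names of surrogates that are at least as good as all others on BOTH measures."""
    stats = {name: sens_spec(truth, s) for name, s in surrogates.items()}
    return [
        a for a in stats
        if all(stats[a][0] >= stats[b][0] and stats[a][1] >= stats[b][1]
               for b in stats if b != a)
    ]


# Hypothetical validated outcomes and two algorithm-derived phenotypes.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
surrogates = {
    "A": [1, 1, 1, 0, 1, 1, 0, 0],  # more sensitive, less specific
    "B": [1, 1, 0, 0, 0, 0, 0, 1],  # less sensitive, more specific
}
winners = dominant_surrogates(truth, surrogates)
```

In this toy example neither surrogate dominates the other, so `winners` is empty. That is the kind of setting in which the abstract reports the largest benefit from using several surrogates at once.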

DISCUSSION

Simulation studies and application to real-world data demonstrated the effectiveness of our method in integrating multiple phenotypes, thereby enhancing bias reduction, statistical accuracy and efficiency.

CONCLUSIONS

Our method combines information across multiple surrogates using a statistically efficient seemingly unrelated regression framework. Our method provides a robust alternative to single-surrogate-based bias correction, especially in contexts lacking information on which surrogate is superior.
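The abstract does not state the estimator itself. As a rough illustration of how several surrogates might be combined with a small validated set, the sketch below applies a generic control-variate style adjustment. It shifts the validated-set estimate by a weighted difference between each surrogate's association estimate in the validated subset and in the full cohort. This is an assumed, simplified form chosen for illustration, not the authors' seemingly unrelated regression procedure. All names, covariance inputs, and numbers are hypothetical.

```python
import numpy as np


def augmented_estimate(beta_val, gamma_val, gamma_full, cov_beta_diff, var_diff):
    """Control-variate style combination (illustrative assumption, not the paper's method).

    beta_val      : estimate(s) from the validated subset, shape (p,)
    gamma_val     : surrogate-based estimates in the validated subset, shape (K,)
    gamma_full    : surrogate-based estimates in the full cohort, shape (K,)
    cov_beta_diff : Cov(beta_val, gamma_val - gamma_full), shape (p, K)
    var_diff      : Var(gamma_val - gamma_full), shape (K, K)
    """
    beta_val = np.atleast_1d(np.asarray(beta_val, dtype=float))
    diff = np.atleast_1d(np.asarray(gamma_val, dtype=float)
                         - np.asarray(gamma_full, dtype=float))
    cov_beta_diff = np.atleast_2d(np.asarray(cov_beta_diff, dtype=float))
    var_diff = np.atleast_2d(np.asarray(var_diff, dtype=float))
    # weights = cov_beta_diff @ inv(var_diff), computed with a linear solve
    weights = np.linalg.solve(var_diff.T, cov_beta_diff.T).T
    return beta_val - weights @ diff


# Hypothetical inputs: one coefficient of interest, two surrogates.
beta_aug = augmented_estimate(
    beta_val=1.0,
    gamma_val=[0.80, 0.60],
    gamma_full=[0.70, 0.65],
    cov_beta_diff=[[0.02, 0.01]],
    var_diff=[[0.04, 0.00], [0.00, 0.02]],
)
```

With these made-up inputs both surrogates receive a weight of 0.5 and the adjusted estimate is about 0.975. In practice the covariance terms would have to be estimated from the data, which this sketch leaves out.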
