Reproducible disease phenotyping at scale: Example of coronary artery disease in UK Biobank.

Affiliations

Institute of Cardiovascular Sciences, University College London, London, United Kingdom.

NIHR University College London Biomedical Research Centre, University College London and University College London Hospitals NHS Foundation Trust, London, United Kingdom.

Publication information

PLoS One. 2022 Apr 5;17(4):e0264828. doi: 10.1371/journal.pone.0264828. eCollection 2022.

DOI: 10.1371/journal.pone.0264828
PMID: 35381005
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8982857/

Abstract

IMPORTANCE

The lack of internationally agreed standards for combining available data sources at scale risks inconsistent disease phenotyping, which limits research reproducibility.

OBJECTIVE

To develop a rules-based algorithm and evaluate whether it can identify coronary artery disease (CAD) sub-phenotypes using electronic health records (EHR) and questionnaire data from UK Biobank (UKB).

DESIGN

Case-control and cohort study.

SETTING

Prospective cohort study of approximately 502,000 individuals aged 40-69 years, recruited into the UK Biobank between 2006 and 2010, with linked hospitalization and mortality data and genotyping.

PARTICIPANTS

We included all individuals for phenotyping into 6 predefined CAD phenotypes using hospital admission and procedure codes, mortality records and baseline survey data. Of these, 408,470 unrelated individuals of European descent had a polygenic risk score (PRS) for CAD estimated.
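
The rules-based assignment described above can be illustrated with a minimal Python sketch. It is not the published algorithm: the `Participant` fields, the precedence order (record-based MI, then record-based CAD without MI, then self-report), and the use of the baseline assessment date to separate prevalent from incident disease are all assumptions made for illustration. The actual code lists and rules are those published in the HDR UK phenotype library.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Participant:
    """Hypothetical per-person summary; field names are illustrative only."""
    baseline: date                            # date of baseline assessment
    first_mi_record: Optional[date] = None    # earliest MI code in hospital/death records
    first_cad_record: Optional[date] = None   # earliest non-MI CAD code or procedure
    self_reported_mi: bool = False            # MI reported in the baseline survey
    self_reported_cad: bool = False           # non-MI CAD reported in the baseline survey


def classify(p: Participant) -> str:
    """Assign one of six CAD phenotypes, or 'no CAD'.

    Assumed precedence (not taken from the paper): record-based MI,
    then record-based CAD without MI, then self-report.
    """
    if p.first_mi_record is not None:
        # Events on or before baseline count as prevalent, later ones as incident.
        return "prevalent MI" if p.first_mi_record <= p.baseline else "incident MI"
    if p.first_cad_record is not None:
        if p.first_cad_record <= p.baseline:
            return "prevalent CAD without MI"
        return "incident CAD without MI"
    if p.self_reported_mi:
        return "prevalent self-reported MI"
    if p.self_reported_cad:
        return "prevalent self-reported CAD without MI"
    return "no CAD"


if __name__ == "__main__":
    example = Participant(baseline=date(2008, 5, 1), first_mi_record=date(2012, 3, 9))
    print(classify(example))
```

Because each rule returns as soon as it matches, every participant falls into exactly one group, which is what makes the six phenotypes mutually exclusive in this sketch.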

EXPOSURE

CAD Phenotypes.

MAIN OUTCOMES AND MEASURES

Association with baseline risk factors, mortality (n = 14,419 over a median follow-up of 7.8 years), and a PRS for CAD.

RESULTS

The algorithm classified individuals with CAD into six groups: prevalent MI (n = 4,900), incident MI (n = 4,621), prevalent CAD without MI (n = 10,910), incident CAD without MI (n = 8,668), prevalent self-reported MI (n = 2,754), and prevalent self-reported CAD without MI (n = 5,623), yielding 37,476 individuals with any type of CAD. Risk factors were similar across the six CAD phenotypes, except for fewer men in the self-reported CAD without MI group (46.7% vs 70.1% for the overall group). In age- and sex-adjusted survival analyses comparing each phenotype with disease-free controls, mortality was highest following incident MI (HR 6.66, 95% CI 6.07-7.31) and lowest for prevalent self-reported CAD without MI at baseline (HR 1.31, 95% CI 1.15-1.50). Associations per SD increase in PRS were similarly graded across the six phenotypes, with the strongest association for prevalent MI (OR 1.50, 95% CI 1.46-1.55) and the weakest for prevalent self-reported CAD without MI (OR 1.08, 95% CI 1.05-1.12). The algorithm is available in the open HDR UK phenotype library (https://portal.caliberresearch.org/).
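
As a quick consistency check on the figures reported above, the short Python sketch below sums the six phenotype counts and recovers an approximate standard error on the log scale from a reported 95% confidence interval. The helper names `log_scale_se` and `ci_midpoint` are hypothetical, and the calculation assumes the intervals are symmetric on the log scale with a normal critical value of 1.96, which is a common convention but is not stated in the abstract.

```python
import math

# Phenotype counts as reported in the Results.
counts = {
    "prevalent MI": 4900,
    "incident MI": 4621,
    "prevalent CAD without MI": 10910,
    "incident CAD without MI": 8668,
    "prevalent self-reported MI": 2754,
    "prevalent self-reported CAD without MI": 5623,
}
total = sum(counts.values())
print(total)  # 37476, matching the reported number with any type of CAD


def log_scale_se(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate SE of log(HR) or log(OR) from a 95% CI (assumes log-symmetric CI)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def ci_midpoint(lower: float, upper: float) -> float:
    """Geometric mean of the CI bounds; close to the point estimate if the CI is log-symmetric."""
    return math.sqrt(lower * upper)


# Incident MI: HR 6.66, 95% CI 6.07-7.31
print(round(ci_midpoint(6.07, 7.31), 2))
print(round(log_scale_se(6.07, 7.31), 3))
```

The geometric mean of the interval bounds landing close to the reported point estimate is a simple way to confirm that an interval was constructed on the log scale.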

CONCLUSIONS

An algorithmic, EHR-based approach distinguished six phenotypes of CAD with distinct survival and PRS associations. This supports the adoption of open approaches to help standardize CAD phenotyping and points to their wider potential value for reproducible research in other conditions.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b54/8982857/0db05ecd6187/pone.0264828.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b54/8982857/da180f9e1020/pone.0264828.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b54/8982857/c8df4ad7cf77/pone.0264828.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b54/8982857/4290576a03a4/pone.0264828.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b54/8982857/e4d7d321f9e8/pone.0264828.g005.jpg
