Overestimated prediction using polygenic prediction derived from summary statistics.

Author information

Department of Biomedical Engineering, Columbia University, New York, USA.

Department of Applied Mathematics & Statistics, Stony Brook University, New York, USA.

Publication information

BMC Genom Data. 2023 Sep 14;24(1):52. doi: 10.1186/s12863-023-01151-4.

Abstract

BACKGROUND

When a polygenic risk score (PRS) is derived from summary statistics, the independence of the discovery and test sets cannot be checked directly. We compared two types of PRS: one derived from raw genetic data (denoted rPRS) and one derived from the IGAP summary statistics (denoted sPRS).

RESULTS

Two highly heritable traits in UK Biobank, hypertension and height, were used to establish a reference scale for the effect of a PRS. An sPRS excluding APOE, derived from the International Genomics of Alzheimer's Project (IGAP), yielded ΔAUC and ΔR of 0.051 ± 0.013 and 0.063 ± 0.015 on the Alzheimer's Disease Sequencing Project (ADSP), and 0.060 and 0.086 on the Accelerating Medicine Partnership - Alzheimer's Disease (AMP-AD). In UK Biobank, assuming discovery and test sets of similar size, the rPRS for hypertension yielded 0.0036 ± 0.0027 (ΔAUC) and 0.0032 ± 0.0028 (ΔR). For height, ΔR was 0.029 ± 0.0037.

CONCLUSION

Given the high heritability of hypertension and height and the sample size of UK Biobank, the sPRS results from the AD databases are inflated. Independence between discovery and test sets is a well-known basic requirement for PRS studies. However, many PRS studies cannot confirm that this requirement is met, because the discovery and test samples cannot be compared directly when only summary statistics are used. Thus, for sPRS, potential sample duplication should be carefully considered within the same ethnic group.
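The effect of non-independent discovery and test sets can be illustrated with a toy simulation. The sketch below is not the authors' analysis: it assumes independent SNPs, a simulated quantitative trait, and the extreme case in which the "overlapping" test set is the discovery set itself. All function names and parameter values (`simulate_overlap_inflation`, `n_per_set`, `n_snps`, `h2`) are hypothetical choices made for illustration.

```python
import numpy as np


def marginal_effects(genotypes, phenotype):
    """Per-SNP simple-regression slopes, a stand-in for GWAS summary statistics."""
    g_centered = genotypes - genotypes.mean(axis=0)
    y_centered = phenotype - phenotype.mean()
    return g_centered.T @ y_centered / (g_centered ** 2).sum(axis=0)


def prs_correlation(genotypes, phenotype, effects):
    """Pearson correlation between the score (genotypes @ effects) and the phenotype."""
    return float(np.corrcoef(genotypes @ effects, phenotype)[0, 1])


def simulate_overlap_inflation(n_per_set=2000, n_snps=500, h2=0.3, seed=0):
    """Return (r_independent, r_overlap) for one simulated data set.

    Assumptions (illustrative only): independent SNPs with allele
    frequency 0.3, normally distributed true effects, heritability h2.
    """
    rng = np.random.default_rng(seed)
    n_total = 2 * n_per_set

    # Simulated genotypes coded as 0, 1, or 2 copies of the effect allele.
    genotypes = rng.binomial(2, 0.3, size=(n_total, n_snps)).astype(float)

    # Genetic component scaled so that it explains a fraction h2 of the variance.
    true_effects = rng.normal(size=n_snps)
    genetic = genotypes @ true_effects
    genetic = (genetic - genetic.mean()) / genetic.std() * np.sqrt(h2)
    phenotype = genetic + rng.normal(scale=np.sqrt(1.0 - h2), size=n_total)

    discovery = slice(0, n_per_set)
    held_out = slice(n_per_set, n_total)

    # Effects are estimated on the discovery half only.
    effects = marginal_effects(genotypes[discovery], phenotype[discovery])

    # Independent test set: individuals never seen during discovery.
    r_independent = prs_correlation(genotypes[held_out], phenotype[held_out], effects)
    # Fully overlapping test set: the discovery individuals are reused.
    r_overlap = prs_correlation(genotypes[discovery], phenotype[discovery], effects)
    return r_independent, r_overlap


if __name__ == "__main__":
    r_ind, r_ovl = simulate_overlap_inflation()
    print(f"R on independent test set: {r_ind:.3f}")
    print(f"R on overlapping test set: {r_ovl:.3f}")
```

In this setup the score evaluated on reused individuals correlates more strongly with the phenotype than the score evaluated on held-out individuals, because the estimated effects have partly fitted the noise of the discovery sample. Partial overlap would be expected to produce a smaller inflation of the same kind.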

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7955/10500750/2e500b76fcee/12863_2023_1151_Fig1_HTML.jpg
