PhenoSpD: an integrated toolkit for phenotypic correlation estimation and multiple testing correction using GWAS summary statistics.

Affiliations

MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, BS8 2BN, UK.

Intelligent Systems Laboratory, University of Bristol, Tyndall Ave, Bristol, BS8 1TH, UK.

Publication information

Gigascience. 2018 Aug 1;7(8):giy090. doi: 10.1093/gigascience/giy090.

DOI:10.1093/gigascience/giy090
PMID:30165448
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6109640/
Abstract

BACKGROUND

Identifying phenotypic correlations between complex traits and diseases can provide useful etiological insights. Restricted access to much individual-level phenotype data makes it difficult to estimate large-scale phenotypic correlation across the human phenome. Two state-of-the-art methods, metaCCA and LD score regression, provide an alternative approach to estimate phenotypic correlation using only genome-wide association study (GWAS) summary results.

RESULTS

Here, we present an integrated R toolkit, PhenoSpD, to use LD score regression to estimate phenotypic correlations using GWAS summary statistics and to utilize the estimated phenotypic correlations to inform correction of multiple testing for complex human traits using the spectral decomposition of matrices (SpD). The simulations suggest that it is possible to identify nonindependence of phenotypes using samples with partial overlap; as overlap decreases, the estimated phenotypic correlations will attenuate toward zero and multiple testing correction will be more stringent than in perfectly overlapping samples. Also, in contrast to LD score regression, metaCCA will provide approximate genetic correlations rather than phenotypic correlation, which limits its application for multiple testing correction. In a case study, PhenoSpD using UK Biobank GWAS results suggested 399.6 independent tests among 487 human traits, which is close to the 352.4 independent tests estimated using true phenotypic correlation. We further applied PhenoSpD to an estimated 5,618 pair-wise phenotypic correlations among 107 metabolites using GWAS summary statistics from Kettunen's publication and PhenoSpD suggested the equivalent of 33.5 independent tests for these metabolites.
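The attenuation with decreasing sample overlap can be illustrated with a small calculation. The abstract does not state a formula; the sketch below assumes the standard expectation from cross-trait LD score regression, in which the regression intercept is approximately the phenotypic correlation scaled by the shared sample size over the geometric mean of the two GWAS sample sizes. The function name and arguments are hypothetical and are not part of PhenoSpD.

```python
import math


def expected_ldsc_intercept(rho, n1, n2, n_shared):
    """Expected cross-trait LD score regression intercept.

    Assumption (not taken from the abstract): the intercept is roughly
    rho * n_shared / sqrt(n1 * n2), where rho is the true phenotypic
    correlation in the overlapping samples and no other confounding
    is present.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("sample sizes must be positive")
    if not 0 <= n_shared <= min(n1, n2):
        raise ValueError("n_shared must lie between 0 and min(n1, n2)")
    return rho * n_shared / math.sqrt(n1 * n2)


if __name__ == "__main__":
    # Same true correlation, progressively smaller overlap.
    for shared in (10000, 5000, 1000, 0):
        est = expected_ldsc_intercept(0.5, 10000, 10000, shared)
        print(shared, round(est, 3))
```

Under this assumption, two studies of equal size with half of their participants in common would recover roughly half of the true phenotypic correlation, which is why the resulting multiple testing correction becomes more stringent as overlap shrinks.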

CONCLUSIONS

PhenoSpD extends the use of summary-level results, providing a simple and conservative way to reduce dimensionality for complex human traits using GWAS summary statistics. This is particularly valuable in the age of large-scale biobank and consortia studies, where GWAS results are much more accessible than individual-level data.
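The step from a phenotypic correlation matrix to a number of independent tests can be sketched as follows. The abstract does not say which SpD estimator produced its figures, so this example assumes Nyholt's estimator, Meff = 1 + (M - 1)(1 - Var(lambda)/M), where lambda are the eigenvalues of the M x M correlation matrix. Because those eigenvalues sum to M and their squares sum to the sum of squared matrix entries, the eigenvalue variance can be obtained without an explicit eigen solver. Function names are hypothetical and do not reflect the PhenoSpD interface.

```python
def effective_tests_nyholt(corr):
    """Effective number of independent tests (Nyholt's SpD estimator).

    corr: square, symmetric correlation matrix as a list of lists,
    with ones on the diagonal. Assumed estimator; not confirmed by
    the abstract as the one PhenoSpD reports.
    """
    m = len(corr)
    if m == 0 or any(len(row) != m for row in corr):
        raise ValueError("corr must be a non-empty square matrix")
    if m == 1:
        return 1.0
    # trace(R @ R) equals the sum of squared eigenvalues of R.
    sum_sq_eig = sum(r * r for row in corr for r in row)
    # Sample variance of the eigenvalues; their mean is 1 because
    # trace(R) = m for a correlation matrix.
    var_eig = (sum_sq_eig - m) / (m - 1)
    return 1 + (m - 1) * (1 - var_eig / m)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for a given number of tests."""
    return alpha / n_tests


if __name__ == "__main__":
    # Toy matrix: traits 1 and 2 correlate at 0.5, trait 3 is independent.
    corr = [
        [1.0, 0.5, 0.0],
        [0.5, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]
    m_eff = effective_tests_nyholt(corr)
    print(round(m_eff, 3), bonferroni_threshold(0.05, m_eff))
```

As a usage illustration with the figures quoted in the abstract, and assuming a family-wise alpha of 0.05 (the alpha is an assumption), 399.6 independent tests would give a per-test threshold of about 1.25e-4, compared with about 1.03e-4 for an uncorrected count of 487 traits.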

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/25a8/6109640/c5761d53e11b/giy090fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/25a8/6109640/a6e31bde33fa/giy090fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/25a8/6109640/dbf1ca457d9c/giy090fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/25a8/6109640/60689331ae16/giy090fig3.jpg

Cited by

1
The Effect of Circulating Proteins and Their Role in Mediating Adiposity's Effect on Endometrial Cancer Risk: Mendelian Randomization and Colocalization Analyses.
Cancer Epidemiol Biomarkers Prev. 2025 Sep 2;34(9):1534-1543. doi: 10.1158/1055-9965.EPI-25-0165.
2
Sex-specific Mendelian randomization phenome-wide association study of basal metabolic rate.
Sci Rep. 2025 Apr 24;15(1):14368. doi: 10.1038/s41598-025-98017-9.
3
Mendelian randomization of immune cell phenotypes to discover potential drug targets for B-cell malignancy.
Blood Cancer J. 2025 Apr 9;15(1):62. doi: 10.1038/s41408-025-01277-x.
4
Causal Effects of 25-Hydroxyvitamin D on Metabolic Syndrome and Metabolic Risk Traits: A Bidirectional Two-Sample Mendelian Randomization Study.
Biomedicines. 2025 Mar 15;13(3):723. doi: 10.3390/biomedicines13030723.
5
A proteogenomic analysis of the adiposity colorectal cancer relationship identifies GREM1 as a probable mediator.
Int J Epidemiol. 2024 Dec 16;54(1). doi: 10.1093/ije/dyae175.
6
Potential Causal Association Between Atrial Fibrillation/Flutter and Primary Open-Angle Glaucoma: A Two-Sample Mendelian Randomisation Study.
J Clin Med. 2024 Dec 16;13(24):7670. doi: 10.3390/jcm13247670.
7
The goldmine of GWAS summary statistics: a systematic review of methods and tools.
BioData Min. 2024 Sep 5;17(1):31. doi: 10.1186/s13040-024-00385-x.
8
Multitrait Genetic Analysis Identifies Novel Pleiotropic Loci for Depression and Schizophrenia in East Asians.
Schizophr Bull. 2025 May 8;51(3):684-695. doi: 10.1093/schbul/sbae145.
9
Disentangling heterogeneity in substance use disorder: Insights from genome-wide polygenic scores.
Transl Psychiatry. 2024 May 29;14(1):221. doi: 10.1038/s41398-024-02923-x.
10
Insights into the genetic architecture of cerebellar lobules derived from the UK Biobank.
Sci Rep. 2024 Apr 25;14(1):9488. doi: 10.1038/s41598-024-59699-9.

References

1
Impact of common genetic determinants of Hemoglobin A1c on type 2 diabetes risk and diagnosis in ancestrally diverse populations: A transethnic genome-wide meta-analysis.
PLoS Med. 2017 Sep 12;14(9):e1002383. doi: 10.1371/journal.pmed.1002383. eCollection 2017 Sep.
2
Connecting genetic risk to disease end points through the human blood plasma proteome.
Nat Commun. 2017 Feb 27;8:14357. doi: 10.1038/ncomms14357.
3
Dissecting the genetics of complex traits using summary association statistics.
Nat Rev Genet. 2017 Feb;18(2):117-127. doi: 10.1038/nrg.2016.142. Epub 2016 Nov 14.
4
LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis.
Bioinformatics. 2017 Jan 15;33(2):272-279. doi: 10.1093/bioinformatics/btw613. Epub 2016 Sep 22.
5
metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis.
Bioinformatics. 2016 Jul 1;32(13):1981-9. doi: 10.1093/bioinformatics/btw052. Epub 2016 Feb 19.
6
Systematic identification of genetic influences on methylation across the human life course.
Genome Biol. 2016 Mar 31;17:61. doi: 10.1186/s13059-016-0926-z.
7
Genome-wide study for circulating metabolites identifies 62 loci and reveals novel systemic effects of LPA.
Nat Commun. 2016 Mar 23;7:11122. doi: 10.1038/ncomms11122.
8
MR-PheWAS: hypothesis prioritization among potential causal effects of body mass index on many outcomes, using Mendelian randomization.
Sci Rep. 2015 Nov 16;5:16645. doi: 10.1038/srep16645.
9
An atlas of genetic correlations across human diseases and traits.
Nat Genet. 2015 Nov;47(11):1236-41. doi: 10.1038/ng.3406. Epub 2015 Sep 28.
10
UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age.
PLoS Med. 2015 Mar 31;12(3):e1001779. doi: 10.1371/journal.pmed.1001779. eCollection 2015 Mar.