Bayesian methods for multivariate modeling of pleiotropic SNP associations and genetic risk prediction.

Author information

Hartley Stephen W, Monti Stefano, Liu Ching-Ti, Steinberg Martin H, Sebastiani Paola

Affiliation

Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA.

Publication information

Front Genet. 2012 Sep 11;3:176. doi: 10.3389/fgene.2012.00176. eCollection 2012.

DOI: 10.3389/fgene.2012.00176

PMID: 22973300

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3438684/

Abstract

Genome-wide association studies (GWAS) have identified numerous associations between genetic loci and individual phenotypes; however, relatively few GWAS have attempted to detect pleiotropic associations, in which loci are simultaneously associated with multiple distinct phenotypes. We show that pleiotropic associations can be directly modeled via the construction of simple Bayesian networks, and that these models can be applied to produce single Bayesian classifiers, or ensembles of them, that leverage pleiotropy to improve genetic risk prediction. The proposed method includes two phases: (1) Bayesian model comparison, to identify single-nucleotide polymorphisms (SNPs) associated with one or more traits; and (2) cross-validation feature selection, in which a final set of SNPs is selected to optimize prediction. To demonstrate the capabilities and limitations of the method, a total of 1600 case-control GWAS datasets with two dichotomous phenotypes were simulated under 16 scenarios, varying the association strengths of causal SNPs, the size of the discovery sets, the balance between cases and controls, and the number of pleiotropic causal SNPs. Across the 16 scenarios, prediction accuracy ranged from 50% to 90%. In the 14 scenarios that included pleiotropically associated SNPs, the pleiotropic model search and prediction methods consistently outperformed the naive model search and prediction. In the two scenarios in which there were no true pleiotropic SNPs, the differences between the pleiotropic and naive model searches were minimal. To further evaluate the method on real data, a discovery set of 1071 sickle cell disease (SCD) patients was used to search for pleiotropic associations between cerebrovascular accidents and fetal hemoglobin level. Classification was performed on a smaller validation set of 352 SCD patients, and showed that the inclusion of pleiotropic SNPs may slightly improve prediction, although the difference was not statistically significant. The proposed method is robust, computationally efficient, and provides a powerful new approach for detecting and modeling pleiotropic disease loci.
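To make phase (1) more concrete, the sketch below compares four candidate structures for a single SNP and two dichotomous traits: no association, association with trait A only, association with trait B only, and a pleiotropic association with both. It is an illustration under stated assumptions, not the authors' implementation: the genotype is modeled as conditional on the trait values, each conditional distribution gets a symmetric Dirichlet prior with `alpha = 1`, and every function name and the toy counts are hypothetical. The abstract does not specify the priors or the network orientation.

```python
from math import lgamma


def log_marginal(counts, alpha=1.0):
    """Log marginal likelihood of categorical counts under a
    symmetric Dirichlet(alpha) prior (Dirichlet-multinomial)."""
    k = len(counts)
    n = sum(counts)
    return (lgamma(k * alpha) - lgamma(k * alpha + n)
            + sum(lgamma(alpha + c) - lgamma(alpha) for c in counts))


def log_ml_genotype(genotypes, parents):
    """Log marginal likelihood of genotypes (coded 0/1/2), with one
    independent Dirichlet-multinomial per parent configuration."""
    groups = {}
    for g, p in zip(genotypes, parents):
        groups.setdefault(p, [0, 0, 0])[g] += 1
    return sum(log_marginal(c) for c in groups.values())


def compare_models(genotypes, trait_a, trait_b):
    """Score the four candidate structures for one SNP (assumed setup)."""
    n = len(genotypes)
    parent_sets = {
        "null": [()] * n,                       # SNP independent of both traits
        "A_only": [(a,) for a in trait_a],      # SNP depends on trait A
        "B_only": [(b,) for b in trait_b],      # SNP depends on trait B
        "pleiotropic": list(zip(trait_a, trait_b)),  # depends on both
    }
    return {name: log_ml_genotype(genotypes, parents)
            for name, parents in parent_sets.items()}


def expand(cells):
    """Turn {(a, b): (n0, n1, n2)} genotype counts into per-subject lists."""
    g, a, b = [], [], []
    for (ta, tb), counts in cells.items():
        for genotype, n in enumerate(counts):
            g += [genotype] * n
            a += [ta] * n
            b += [tb] * n
    return g, a, b


if __name__ == "__main__":
    # Toy counts (invented for illustration): genotype shifts with both traits.
    cells = {(0, 0): (40, 8, 2), (1, 0): (10, 30, 10),
             (0, 1): (10, 30, 10), (1, 1): (2, 8, 40)}
    scores = compare_models(*expand(cells))
    for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{name:12s} {score:9.2f}")
```

The difference between two log marginal likelihoods is a log Bayes factor, so ranking the four scores amounts to a Bayesian model comparison for that SNP. In a genome-wide setting the same comparison would be repeated per SNP, with the retained SNPs passed on to the feature-selection phase.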


Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c7e4/3438684/37c01737fd78/fgene-03-00176-g001.jpg
