Stability of bivariate GWAS biomarker detection.

Authors

Bedő Justin, Rawlinson David, Goudey Benjamin, Ong Cheng Soon

Affiliations

NICTA Victoria Research Laboratory, University of Melbourne, Victoria, Australia; Department of Computing and Information Systems, University of Melbourne, Victoria, Australia.

NICTA Victoria Research Laboratory, University of Melbourne, Victoria, Australia; Department of Electrical & Electronic Engineering, University of Melbourne, Victoria, Australia.

Publication details

PLoS One. 2014 Apr 30;9(4):e93319. doi: 10.1371/journal.pone.0093319. eCollection 2014.

DOI:10.1371/journal.pone.0093319

PMID:24787002

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4005767/

Abstract

Given the difficulty and effort required to confirm candidate causal SNPs detected in genome-wide association studies (GWAS), there is no practical way to definitively filter false positives. Recent advances in algorithmics and statistics have enabled repeated exhaustive search for bivariate features in a practical amount of time using standard computational resources, allowing us to use cross-validation to evaluate the stability of the results. We performed 10 trials of 2-fold cross-validation of exhaustive bivariate analysis on seven Wellcome-Trust Case-Control Consortium GWAS datasets, comparing the traditional χ² test for association, the high-performance GBOOST method and the recently proposed GSS statistic (available at http://bioinformatics.research.nicta.com.au/software/gwis/). We use Spearman's correlation to measure the similarity between the folds of cross-validation. To compare incomplete lists of ranks we propose an extension to Spearman's correlation. The extension allows us to consider a natural threshold for feature selection where the correlation is zero. This is the first reported cross-validation study of exhaustive bivariate GWAS feature selection. We found that stability between ranked lists from different cross-validation folds was higher for GSS in the majority of diseases. A thorough analysis of the correlation between SNP-frequency and univariate χ² score demonstrated that the χ² test for association is highly confounded by main effects: SNPs with high univariate significance replicably dominate the ranked results. We show that removal of the univariately significant SNPs improves χ² replicability but risks filtering pairs involving SNPs with univariate effects. We empirically confirm that the stability of GSS and GBOOST was not affected by removal of univariately significant SNPs.

These results suggest that the GSS and GBOOST tests are successfully targeting bivariate association with phenotype and that GSS is able to reliably detect a larger set of SNP-pairs than GBOOST in the majority of the data we analysed. However, the χ² test for association was confounded by main effects.
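The stability measurement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain Spearman's correlation on complete score lists and does not reproduce the paper's extension to incomplete ranked lists, which the abstract does not specify. The function names (`average_ranks`, `spearman`, `two_fold_stability`) and the `score_features` callback are hypothetical; `score_features` stands in for any statistic (χ², GBOOST or GSS) that returns one score per feature for a set of samples.

```python
import random
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks (largest value gets rank 1), averaging ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of values tied with the one at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        tied_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = tied_rank
        pos = end + 1
    return ranks


def spearman(scores_a, scores_b):
    """Spearman's correlation: Pearson correlation of the two rank vectors.

    Assumes both lists score the same features in the same order and that
    neither list is constant (which would give zero variance).
    """
    ra, rb = average_ranks(scores_a), average_ranks(scores_b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


def two_fold_stability(samples, score_features, trials=10, seed=0):
    """Mean Spearman correlation between the feature scores obtained on the
    two halves of repeated random 2-fold splits (hypothetical helper)."""
    rng = random.Random(seed)
    correlations = []
    for _ in range(trials):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        fold_a, fold_b = shuffled[:half], shuffled[half:]
        correlations.append(
            spearman(score_features(fold_a), score_features(fold_b))
        )
    return mean(correlations)
```

A value near 1 means the two halves of the data rank the features almost identically, while a value near 0 means the rankings are essentially unrelated. For exhaustive bivariate analysis the "features" would be SNP pairs, so a real study needs a far more efficient scoring step than this sketch suggests.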

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f345/4005767/ed1ad0a2ba69/pone.0093319.g001.jpg
