Re-ranking sequencing variants in the post-GWAS era for accurate causal variant identification.

Affiliation

Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada.

Publication information

PLoS Genet. 2013;9(8):e1003609. doi: 10.1371/journal.pgen.1003609. Epub 2013 Aug 8.

Abstract

Next generation sequencing has dramatically increased our ability to localize disease-causing variants by providing base-pair level information at costs increasingly feasible for the large sample sizes required to detect complex-trait associations. Yet, identification of causal variants within an established region of association remains a challenge. Counter-intuitively, certain factors that increase power to detect an associated region can decrease power to localize the causal variant. First, combining GWAS with imputation or low coverage sequencing to achieve the large sample sizes required for high power can have the unintended effect of producing differential genotyping error among SNPs. This tends to bias the relative evidence for association toward better genotyped SNPs. Second, re-use of GWAS data for fine-mapping exploits previous findings to ensure genome-wide significance in GWAS-associated regions. However, using GWAS findings to inform fine-mapping analysis can bias evidence away from the causal SNP toward the tag SNP and SNPs in high LD with the tag. Together these factors can reduce power to localize the causal SNP by more than half. Other strategies commonly employed to increase power to detect association, namely increasing sample size and using higher density genotyping arrays, can, in certain common scenarios, actually exacerbate these effects and further decrease power to localize causal variants. We develop a re-ranking procedure that accounts for these adverse effects and substantially improves the accuracy of causal SNP identification, often doubling the probability that the causal SNP is top-ranked. Application to the NCI BPC3 aggressive prostate cancer GWAS with imputation meta-analysis identified a new top SNP at 2 of 3 associated loci and several additional possible causal SNPs at these loci that may have otherwise been overlooked. This method is simple to implement using R scripts provided on the author's website.
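The first effect described in the abstract, relative evidence shifting toward better genotyped SNPs, can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' re-ranking procedure. It relies on a common approximation, assumed here, that the expected noncentrality of a 1-df association test at a SNP scales with the sample size, the squared LD correlation with the causal SNP, and the squared correlation between true and called (or imputed) genotypes. The function name `expected_ncp` and all numeric values are hypothetical.

```python
def expected_ncp(n_samples, per_sample_ncp, r2_with_causal=1.0, genotype_r2=1.0):
    """Approximate expected noncentrality of a 1-df association test at one SNP.

    Assumption (not taken from the paper): the noncentrality attenuates
    multiplicatively with LD r^2 to the causal SNP and with genotype quality r^2.
    """
    return n_samples * per_sample_ncp * r2_with_causal * genotype_r2


# Hypothetical scenario: the causal SNP is imputed, the tag SNP is directly genotyped.
n = 10_000
per_sample = 0.004  # hypothetical noncentrality per sample at a perfectly genotyped causal SNP

causal = expected_ncp(n, per_sample, r2_with_causal=1.0, genotype_r2=0.7)
tag = expected_ncp(n, per_sample, r2_with_causal=0.9, genotype_r2=1.0)

# The absolute gap in expected evidence grows linearly with sample size
# under this approximation.
gap = tag - causal
gap_at_double_n = (
    expected_ncp(2 * n, per_sample, 0.9, 1.0)
    - expected_ncp(2 * n, per_sample, 1.0, 0.7)
)

print(f"causal SNP (imputed):  {causal:.1f}")
print(f"tag SNP (genotyped):   {tag:.1f}")
print(f"gap at n and at 2n:    {gap:.1f}, {gap_at_double_n:.1f}")
```

With these made-up inputs the well genotyped tag SNP has the larger expected test statistic even though it is not causal, and the gap widens as the sample size increases. This is consistent with the abstract's observation that larger samples can exacerbate the problem in some scenarios.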

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1f03/3738448/2816aa2a6cb2/pgen.1003609.g001.jpg
