Naturalgwas: An R package for evaluating genomewide association methods with empirical data.

Author information

Université Grenoble-Alpes, Grenoble, France.

Grenoble Institute of Technology, Grenoble, France.

Publication information

Mol Ecol Resour. 2018 Jul;18(4):789-797. doi: 10.1111/1755-0998.12892. Epub 2018 May 7.

Abstract

Association studies of polygenic traits are notoriously difficult when those studies are conducted at large geographic scales. The difficulty arises because genotype frequencies often vary in geographic space and across distinct environments. Those large-scale variations are known to yield false positives in standard association testing approaches. Although several methods alleviate this problem, no tools have been proposed to evaluate the power that association tests could achieve for a specific study design and set of genotypes. Our goal here is to present an R program fulfilling this objective, by allowing users to simulate phenotypes from observed genotypes and to estimate upper bounds on achievable power. The simulation model can incorporate realistic features such as population structure and gene-by-environment interactions, and the package implements a gold-standard test that evaluates power using information on confounders. We illustrated the use of the program with example studies based on data for the plant species Arabidopsis thaliana. Simulated phenotypes were used to compare the ability of two recent association methods to correctly remove confounding factors, to evaluate power to detect causal variants, and to assess the influence of various parameters. For the simulated data, the new tests reached performances close to the gold-standard test and could be reasonably used with measured phenotypes. Power to detect causal variants was influenced by the number of variants and by the strength of their effect sizes, and specific thresholds were obtained from the simulation study. In conclusion, our program provides guidance on methodological choice of association tests, as well as useful knowledge on test performances in a user-specific context.
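
The general idea of simulating phenotypes from observed genotypes and scoring them with a test that knows the confounders can be illustrated with a minimal Python sketch. This is not the package's R interface or its actual simulation model: the function names `simulate_phenotype` and `oracle_scores`, the additive model, the use of leading singular vectors of the genotype matrix as stand-ins for population structure, and all parameter defaults are assumptions made for illustration only. Gene-by-environment interactions are omitted.

```python
import numpy as np

def simulate_phenotype(genotypes, n_causal=5, effect_size=1.0,
                       n_confounders=2, confounder_strength=1.0, seed=0):
    """Hypothetical additive model: causal SNP effects + structure + noise."""
    rng = np.random.default_rng(seed)
    n, p = genotypes.shape
    centered = genotypes - genotypes.mean(axis=0)
    # Assumption: leading left singular vectors approximate population structure.
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    confounders = u[:, :n_confounders]
    # Pick causal variants at random among the observed SNPs.
    causal = np.sort(rng.choice(p, size=n_causal, replace=False))
    genetic = centered[:, causal] @ np.full(n_causal, effect_size)
    # Scale unit-norm singular vectors so each entry has roughly unit variance.
    structure = confounder_strength * np.sqrt(n) * confounders.sum(axis=1)
    noise = rng.standard_normal(n)
    return genetic + structure + noise, causal, confounders

def oracle_scores(y, genotypes, confounders):
    """Absolute t-statistics per SNP, adjusting for the known confounders."""
    n, _ = genotypes.shape
    covars = np.column_stack([np.ones(n), confounders])
    # Project the covariates out of both the phenotype and every SNP.
    q, _ = np.linalg.qr(covars)
    y_res = y - q @ (q.T @ y)
    g_res = genotypes - q @ (q.T @ genotypes)
    ss = (g_res ** 2).sum(axis=0)
    ss = np.where(ss > 1e-12, ss, np.nan)  # SNPs with no residual variation
    beta = (g_res.T @ y_res) / ss
    dof = n - covars.shape[1] - 1
    rss = (y_res ** 2).sum() - beta ** 2 * ss
    se = np.sqrt(rss / dof / ss)
    return np.abs(beta / se)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    freqs = rng.uniform(0.1, 0.9, size=300)
    # Synthetic stand-in for an observed genotype matrix (200 individuals).
    G = rng.binomial(2, freqs, size=(200, 300)).astype(float)
    y, causal, conf = simulate_phenotype(G, seed=2)
    scores = oracle_scores(y, G, conf)
    top = np.argsort(-np.nan_to_num(scores))[:len(causal)]
    print("causal SNPs recovered in top ranks:", np.intersect1d(top, causal).size)
```

Repeating such a simulation while varying the number of causal variants and their effect size gives a rough picture of how detection rates change, which is the kind of question the abstract describes.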
