INRES-Plant Breeding, Rheinische Friedrich-Wilhelms-Universität Bonn, 53113 Bonn, Germany.
Int J Mol Sci. 2023 Sep 12;24(18):14011. doi: 10.3390/ijms241814011.
Estimating the false discovery rate (FDR) significance threshold in genome-wide association studies (GWAS) remains a major challenge in distinguishing true positive hypotheses from false positive and false negative errors. Several multiple-testing comparison methods have been developed to determine the significance threshold; however, these methods may be overly conservative and lead to an increase in false negative results. The local FDR approach, which is based on the empirical Bayes perspective, is suitable for testing many associations simultaneously. In local FDR, the maximum likelihood estimator is sensitive to bias when the GWAS model contains two or more explanatory variables as genetic parameters simultaneously. The main criticism of local FDR is that it focuses only locally on the effects of single nucleotide polymorphisms (SNPs) in the tails of the distribution, whereas the signal associations are distributed across the whole genome. The advantage of the Bayesian perspective is that knowledge of the prior distribution comes from other genetic parameters included in the GWAS model, such as linkage disequilibrium (LD) analysis, minor allele frequency (MAF) and call rate of significant associations. We also proposed Bayesian survival FDR to address the multi-collinearity and large-scale problems in the grain yield (GY) vector in bread wheat with large-scale SNP information. The objective of this study was to obtain a short list of SNPs that are reliably associated with GY under low and high levels of nitrogen (N) in the population. The top five significant SNPs were compared with different Bayesian models. Based on the time to event in the Bayesian survival analysis, differences between minor and major alleles within the association panel can be identified.
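To make the local FDR idea in the abstract concrete, the following is a minimal sketch of the standard two-group formulation, fdr(z) = pi0 * f0(z) / f(z), where f0 is the null density and f is the mixture density of all test statistics. It is not the Bayesian survival FDR proposed in the study. The function names, the kernel density estimate used for f, the default pi0 = 1.0, the 0.2 cutoff, and the simulated z-scores are all assumptions chosen for illustration.

```python
import math
import random


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a normal distribution at x."""
    u = (x - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def silverman_bandwidth(sample):
    """Rule-of-thumb bandwidth for a Gaussian kernel density estimate."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)


def local_fdr(z_scores, pi0=1.0, bandwidth=None):
    """Estimate fdr(z) = pi0 * f0(z) / f(z) for every z-score.

    f0 is the theoretical N(0, 1) null density. f is the mixture density,
    estimated here with a Gaussian kernel density estimate (an assumption;
    other density estimators are possible). pi0 = 1.0 is a conservative
    illustrative default, not a value taken from the study.
    """
    if bandwidth is None:
        bandwidth = silverman_bandwidth(z_scores)
    n = len(z_scores)
    result = []
    for z in z_scores:
        f_hat = sum(normal_pdf(z, s, bandwidth) for s in z_scores) / n
        result.append(min(1.0, pi0 * normal_pdf(z) / f_hat))
    return result


def simulate_z_scores(n_null=950, n_signal=50, effect=4.0, seed=1):
    """Hypothetical data: null SNPs ~ N(0, 1), associated SNPs ~ N(effect, 1)."""
    rng = random.Random(seed)
    null = [rng.gauss(0.0, 1.0) for _ in range(n_null)]
    signal = [rng.gauss(effect, 1.0) for _ in range(n_signal)]
    return null + signal


z = simulate_z_scores()
fdr = local_fdr(z)
# Indices of simulated SNPs whose local FDR falls below an illustrative cutoff.
short_list = [i for i, value in enumerate(fdr) if value < 0.2]
print(f"{len(short_list)} of {len(z)} simulated SNPs have local FDR < 0.2")
```

Because f0 is small relative to f only in the tails, this estimator mostly flags extreme z-scores, which illustrates the criticism raised in the abstract that local FDR concentrates on the tails of the distribution.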