Fast effect size shrinkage software for beta-binomial models of allelic imbalance.

Affiliations

Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27516, USA.

Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27514, USA.

Publication information

F1000Res. 2019 Nov 28;8:2024. doi: 10.12688/f1000research.20916.2. eCollection 2019.

Abstract

Allelic imbalance occurs when the two alleles of a gene are differentially expressed within a diploid organism and can indicate important differences in cis-regulation and epigenetic state across the two chromosomes. Because of this, the ability to accurately quantify the proportion at which each allele of a gene is expressed is of great interest to researchers. This becomes challenging in the presence of small read counts and/or small sample sizes, which can cause estimators for allelic expression proportions to have high variance. Investigators have traditionally dealt with this problem by filtering out genes with small counts and samples. However, this may inadvertently remove important genes that have truly large allelic imbalances. Another option is to use pseudocounts or Bayesian estimators to reduce the variance. To this end, we evaluated the accuracy of four different estimators: maximum likelihood (ML), adding a pseudocount to each allele, approximate posterior estimation of GLM coefficients (apeglm) and adaptive shrinkage (ash), the latter two of which are Bayesian shrinkage estimators. We also wrote C++ code to quickly calculate ML and apeglm estimates and integrated it into the apeglm package. The four methods were evaluated on two simulations and one real data set. Apeglm consistently performed better than ML according to a variety of criteria, and generally outperformed use of pseudocounts as well. Ash also performed better than ML in one of the simulations, but in the other its performance was more mixed. Finally, when compared to five other packages that also fit beta-binomial models, the apeglm package was substantially faster and more numerically reliable, making it useful for quick and reliable analyses of allelic imbalance. Apeglm is available as an R/Bioconductor package at http://bioconductor.org/packages/apeglm.
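To illustrate why small read counts make the ML estimate unstable, and how a pseudocount tempers it, the following is a minimal sketch only. It assumes a plain binomial model for a single gene in a single sample, not the beta-binomial GLM fitted by the package, and it does not use the apeglm API. The function names and the pseudocount of 1 per allele are hypothetical choices made for this illustration.

```python
def ml_proportion(alt_reads: int, total_reads: int) -> float:
    """ML estimate of the allelic proportion under a simple binomial model."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive for the ML estimate")
    return alt_reads / total_reads


def pseudocount_proportion(alt_reads: int, total_reads: int,
                           pseudo: float = 1.0) -> float:
    """Allelic proportion after adding `pseudo` reads to each of the two alleles.

    The value pseudo=1.0 is an assumption for illustration, not a setting
    taken from the paper.
    """
    return (alt_reads + pseudo) / (total_reads + 2.0 * pseudo)


# A gene covered by only 3 reads, all from one allele:
print(ml_proportion(3, 3))                        # 1.0
print(pseudocount_proportion(3, 3))               # 0.8

# The same observed proportion backed by 300 reads is barely changed:
print(round(pseudocount_proportion(300, 300), 4))  # 0.9967
```

With 3 reads the pseudocount pulls the estimate noticeably toward 0.5, whereas with 300 reads it has almost no effect. This is the same qualitative behavior that shrinkage estimators aim for: less trust in extreme proportions when they rest on little data.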


