Powerful and interpretable control of false discoveries in two-group differential expression studies.

Authors

Enjalbert-Courrech Nicolas, Neuvial Pierre

Affiliation

Institut de Mathématiques de Toulouse, UMR 5219, Université de Toulouse, CNRS, UPS, F-31062 Toulouse Cedex 9, France.

Publication information

Bioinformatics. 2022 Nov 30;38(23):5214-5221. doi: 10.1093/bioinformatics/btac693.

Abstract

MOTIVATION

The standard approach for statistical inference in differential expression (DE) analyses is to control the false discovery rate (FDR). However, controlling the FDR does not in fact imply that the proportion of false discoveries is upper bounded. Moreover, no statistical guarantee can be given on subsets of genes selected by FDR thresholding. These known limitations are overcome by post hoc inference, which provides guarantees on the number or proportion of false discoveries among arbitrary gene selections. However, post hoc inference methods are not yet widely used for DE studies.

RESULTS

In this article, we demonstrate the relevance and illustrate the performance of adaptive interpolation-based post hoc methods for two-group DE studies. First, we formalize the use of permutation-based methods to obtain sharp confidence bounds that are adaptive to the dependence between genes. Then, we introduce a generic linear-time algorithm for computing post hoc bounds, making these bounds applicable to large-scale two-group DE studies. The use of the resulting Adaptive Simes bound is illustrated on an RNA sequencing study. Comprehensive numerical experiments based on real microarray and RNA sequencing data demonstrate the statistical performance of the method.
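
To make the idea of a post hoc bound concrete, the following is a minimal Python sketch of an interpolation-based bound on the number of false discoveries in an arbitrary gene selection. It rests on several assumptions that are not taken from the abstract: it uses the plain (non-adaptive) Simes thresholds t_k = alpha * k / m rather than the permutation-calibrated thresholds behind the Adaptive Simes bound, it is a naive implementation rather than the linear-time algorithm mentioned above, and the function name `simes_posthoc_bound` is hypothetical and is not the sanssouci API.

```python
import numpy as np


def simes_posthoc_bound(p_values, selection, alpha=0.05):
    """Upper bound on the number of false discoveries among `selection`.

    Sketch of the interpolation bound
        V(S) = min_k ( #{i in S : p_i >= t_k} + k - 1 ), capped at |S|,
    with the non-adaptive Simes thresholds t_k = alpha * k / m (an
    assumption of this sketch, not the calibrated thresholds of the paper).
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    idx = np.asarray(selection, dtype=int)
    p_sel = np.sort(p[idx])
    s = p_sel.size
    if s == 0:
        return 0

    k = np.arange(1, m + 1)
    thresholds = alpha * k / m

    # For each k, count the selected p-values strictly below t_k,
    # then deduce how many are at or above t_k.
    n_below = np.searchsorted(p_sel, thresholds, side="left")
    n_at_or_above = s - n_below

    bound = np.min(n_at_or_above + k - 1)
    return int(min(bound, s))


# Toy example: 10 hypotheses, 3 of them with very small p-values.
p = [0.001, 0.002, 0.003, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99]
top3 = [0, 1, 2]
everything = list(range(10))
print(simes_posthoc_bound(p, top3))        # bound for the three smallest p-values
print(simes_posthoc_bound(p, everything))  # bound for the full list
```

Because the bound holds simultaneously over all subsets, the same p-values can be queried for any selection chosen after seeing the data; dividing the result by the size of the selection gives the corresponding bound on the false discovery proportion.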

AVAILABILITY AND IMPLEMENTATION

A cross-platform open source implementation within the R package sanssouci is available at https://sanssouci-org.github.io/sanssouci/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
