On the utility of RNA sample pooling to optimize cost and statistical power in RNA sequencing experiments.

Affiliations

Department of Data Analysis and Mathematical Modeling, Ghent University, Ghent, 9000, Belgium.

Department of Biomolecular Medicine, Ghent University, Ghent, 9000, Belgium.

Publication information

BMC Genomics. 2020 Apr 19;21(1):312. doi: 10.1186/s12864-020-6721-y.

Abstract

BACKGROUND

In gene expression studies, RNA sample pooling is sometimes considered because of budget constraints or a lack of sufficient input material. For microarray technology, RNA sample pooling strategies have been reported to optimize both the cost of data generation and the statistical power for differential gene expression (DGE) analysis. For RNA sequencing, whose quantitative output differs in that it consists of counts with a tunable dynamic range, the adequacy of RNA sample pooling strategies has not yet been evaluated or empirically validated. In this study, we comprehensively assessed the utility of pooling strategies in RNA-seq experiments using empirical and simulated RNA-seq datasets.

RESULTS

The data-generating model in pooled experiments is defined mathematically to evaluate the mean and variability of gene expression estimates. The model is further used to examine the trade-off between the statistical power of testing for DGE and the data-generating costs. Pooling strategies are assessed empirically by analyzing RNA-seq datasets under various pooling and non-pooling experimental settings. A simulation study is also used to rank experimental scenarios with respect to the rates of false and true discoveries in DGE analysis. The results demonstrate that pooling strategies in RNA-seq studies can be both cost-effective and powerful when the number of pools, pool size and sequencing depth are optimally defined.
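The abstract does not spell out the data-generating model, so the following is only a minimal sketch of the general idea, under assumptions that are not taken from the paper: each sample contributes equally to its pool, expression is treated on a continuous (for example log) scale, biological variance `sigma_b2` and technical variance `sigma_t2` are additive, and all function names are hypothetical. Under those assumptions a library built from `pool_size` samples has variance `sigma_b2 / pool_size + sigma_t2`, which a small simulation can check.

```python
import random
import statistics


def pooled_measurement_variance(sigma_b2, sigma_t2, pool_size):
    """Variance of one library made from `pool_size` equally mixed samples.

    Assumption (not from the paper): pooling averages biological variation,
    while technical noise is added once per sequenced library.
    """
    return sigma_b2 / pool_size + sigma_t2


def group_mean_variance(sigma_b2, sigma_t2, pool_size, n_pools):
    """Variance of the group mean estimated from `n_pools` independent pools."""
    return pooled_measurement_variance(sigma_b2, sigma_t2, pool_size) / n_pools


def simulate_pool_measurements(mu, sigma_b2, sigma_t2, pool_size, n_draws, seed=0):
    """Draw `n_draws` simulated pool-level measurements for one gene."""
    rng = random.Random(seed)
    sd_b = sigma_b2 ** 0.5
    sd_t = sigma_t2 ** 0.5
    measurements = []
    for _ in range(n_draws):
        # Biological level of each individual in the pool.
        individuals = [rng.gauss(mu, sd_b) for _ in range(pool_size)]
        # Equal mixing: the pool's true level is the average of its members.
        pool_level = sum(individuals) / pool_size
        # One round of technical noise per library.
        measurements.append(pool_level + rng.gauss(0.0, sd_t))
    return measurements


if __name__ == "__main__":
    draws = simulate_pool_measurements(5.0, 1.0, 0.1, pool_size=4, n_draws=20000)
    print("simulated variance:", statistics.variance(draws))
    print("analytic variance: ", pooled_measurement_variance(1.0, 0.1, 4))
```

In this toy model, pooling shrinks only the biological term, so the benefit is largest when within-group variability dominates technical noise. Real RNA-seq counts are discrete and overdispersed, so this sketch should not be read as the model used in the study.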

CONCLUSIONS

When within-group gene expression variability is high, small RNA sample pools are effective in reducing the variability and compensating for the reduced number of replicates. Unlike typical cost-saving strategies, such as reducing the sequencing depth or the number of RNA samples (replicates), an adequate pooling strategy is effective in maintaining the power of testing for DGE in genes with low to medium abundance levels, while substantially reducing the total cost of the experiment. In general, pooling RNA samples, alone or in conjunction with a moderate reduction of the sequencing depth, can be a good option for optimizing the cost while maintaining the power.
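To make the cost side of this trade-off concrete, here is a rough sketch of a per-experiment cost calculation. The cost structure (a per-sample handling cost, a per-library preparation cost, and a sequencing cost proportional to depth) and every price below are hypothetical placeholders chosen for illustration, not figures from the study.

```python
def experiment_cost(n_samples, n_libraries, reads_per_library_millions,
                    cost_per_sample, cost_per_library, cost_per_million_reads):
    """Total cost under an assumed three-part cost structure.

    Every sample is still collected and extracted, but only pooled
    libraries are prepared and sequenced.
    """
    sample_cost = n_samples * cost_per_sample
    library_cost = n_libraries * cost_per_library
    sequencing_cost = (n_libraries * reads_per_library_millions
                       * cost_per_million_reads)
    return sample_cost + library_cost + sequencing_cost


if __name__ == "__main__":
    # Hypothetical prices, in arbitrary currency units.
    prices = dict(cost_per_sample=10, cost_per_library=100,
                  cost_per_million_reads=5)

    # 24 samples sequenced individually at 20 million reads each.
    unpooled = experiment_cost(24, 24, 20, **prices)
    # The same 24 samples combined into 8 pools of 3, same depth per library.
    pooled = experiment_cost(24, 8, 20, **prices)

    print("unpooled:", unpooled)
    print("pooled:  ", pooled)
```

Because library preparation and sequencing scale with the number of libraries rather than the number of samples, pooling lowers the total cost in this model while every biological sample still contributes to the estimate.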

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e27/7168886/276d24565de1/12864_2020_6721_Fig1_HTML.jpg
