Highly effective batch effect correction method for RNA-seq count data.

Author information

Zhang Xiaoyu

Affiliation

Department of Computer Science and Information Science, California State University San Marcos, 333 S. Twin Oaks Valley Rd, San Marcos, CA 92096, USA.

Publication information

Comput Struct Biotechnol J. 2024 Dec 16;27:58-64. doi: 10.1016/j.csbj.2024.12.010. eCollection 2025.

Abstract

RNA sequencing (RNA-seq) has become a cornerstone of transcriptomics, providing detailed insights into gene expression across diverse biological conditions and sample types. However, RNA-seq data are often confounded by batch effects, systematic non-biological variations that compromise data reliability and obscure true biological differences. To address these challenges, we introduce ComBat-ref, a refined batch effect correction method designed to enhance the statistical power and reliability of differential expression analysis in RNA-seq data. Building on the principles of ComBat-seq, ComBat-ref employs a negative binomial model for count data adjustment but innovates by selecting a reference batch with the smallest dispersion, preserving count data for the reference batch, and adjusting other batches towards the reference batch. Our method demonstrated superior performance in both simulated environments and real-world datasets, including the growth factor receptor network (GFRN) data and NASA GeneLab transcriptomic datasets, significantly improving sensitivity and specificity compared to existing methods. By effectively mitigating batch effects while maintaining high detection power, ComBat-ref provides a robust solution for improving the accuracy and interpretability of RNA-seq data analyses.
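The reference-selection step described in the abstract can be illustrated with a minimal sketch. This is not the published ComBat-ref implementation: the abstract does not say how dispersion is estimated or summarized per batch, so the method-of-moments estimator, the median summary, and the function names `moment_dispersion` and `pick_reference_batch` are all assumptions made for illustration. The subsequent step of adjusting the other batches towards the reference is not shown.

```python
import numpy as np

def moment_dispersion(counts):
    """Per-gene negative binomial dispersion by method of moments.

    Assumes var = mu + phi * mu^2, so phi = (var - mu) / mu^2.
    This estimator is an illustrative assumption, not the paper's.
    """
    counts = np.asarray(counts, dtype=float)
    mu = counts.mean(axis=1)
    var = counts.var(axis=1, ddof=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        phi = (var - mu) / mu**2
    phi = np.where(mu > 0, phi, np.nan)   # genes with zero mean carry no information
    return np.clip(phi, 0.0, None)        # negative estimates are truncated to 0

def pick_reference_batch(counts, batches):
    """Return the batch label with the smallest median dispersion.

    counts:  genes x samples array of raw counts
    batches: one batch label per sample (column)
    Summarizing each batch by its median dispersion is an assumption.
    """
    counts = np.asarray(counts)
    batches = np.asarray(batches)
    scores = {}
    for b in np.unique(batches):
        phi = moment_dispersion(counts[:, batches == b])
        scores[b.item()] = float(np.nanmedian(phi))
    ref = min(scores, key=scores.get)
    return ref, scores

# Toy data: batch "A" is tightly clustered, batch "B" is highly variable.
toy_counts = np.array([[10, 11, 9, 2, 40, 90],
                       [20, 19, 21, 5, 60, 100]])
toy_batches = ["A", "A", "A", "B", "B", "B"]
ref, scores = pick_reference_batch(toy_counts, toy_batches)
```

In this sketch the counts of the chosen batch would be left untouched, matching the abstract's statement that the reference batch is preserved while the other batches are adjusted towards it.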


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbd7/11718288/4f57db2f54af/ga1.jpg
