Normalization of RNA-Seq data using adaptive trimmed mean with multi-reference.

Affiliation

School of Life Sciences, Gwangju Institute of Science and Technology, 123 Cheomdan-gwagiro, 61005, Gwangju, South Korea.

Publication information

Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae241.

DOI:10.1093/bib/bbae241
PMID:38770720
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11107385/
Abstract

Normalization of RNA sequencing data is a primary step for downstream analysis. The most popular normalization methods are the trimmed mean of M values (TMM) and DESeq. TMM trims away extreme log fold changes in the data and normalizes the raw read counts based on the remaining non-differentially expressed genes. However, the major problem with TMM is that the value of the trimming factor M is chosen heuristically. This paper estimates an adaptive value of M in TMM based on Jaeckel's estimator, and each sample acts as a reference to find the scale factor of each sample. The approach is validated on the SEQC, MAQC2, MAQC3 and PICKRELL datasets and on two simulated datasets with two-group and three-group conditions, varying the percentage of differential expression and the number of replicates. Compared with various state-of-the-art methods, the approach performs better in terms of area under the receiver operating characteristic curve and differential expression.
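
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the abstract: the function names are hypothetical, the trimmed mean is unweighted (standard TMM also trims on average expression and uses precision weights), the trim fraction is picked from a fixed grid by a Jaeckel-style criterion (the fraction that minimizes the estimated asymptotic variance of the trimmed mean), and the per-reference results are combined by averaging log factors.

```python
import numpy as np


def jaeckel_trim_fraction(x, alphas=np.linspace(0.05, 0.45, 9)):
    """Pick the trim fraction with the smallest estimated variance of the
    trimmed mean (Jaeckel-style criterion; the candidate grid is an assumption)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    best_alpha, best_var = 0.0, np.inf
    for alpha in alphas:
        g = int(np.floor(alpha * n))      # values dropped from each tail
        if n - 2 * g < 2:
            continue
        core = x[g:n - g]
        t = core.mean()
        # Winsorized sum of squares, scaled as in Jaeckel's variance estimate.
        var = (np.sum((core - t) ** 2) / n
               + alpha * ((core[0] - t) ** 2 + (core[-1] - t) ** 2))
        var /= (1.0 - 2.0 * alpha) ** 2
        if var < best_var:
            best_alpha, best_var = alpha, var
    return best_alpha


def trimmed_mean(x, alpha):
    """Symmetric trimmed mean: drop floor(alpha * n) values from each tail."""
    x = np.sort(np.asarray(x, dtype=float))
    g = int(np.floor(alpha * x.size))
    return x[g:x.size - g].mean()


def adaptive_multiref_factors(counts):
    """Scale factors from a genes x samples count matrix, using every sample
    as the reference in turn (the combination rule is an assumption)."""
    counts = np.asarray(counts, dtype=float)
    lib = counts.sum(axis=0)              # library sizes
    n_samples = counts.shape[1]
    log_f = np.zeros((n_samples, n_samples))   # indexed [reference, sample]
    for r in range(n_samples):
        for k in range(n_samples):
            if k == r:
                continue
            keep = (counts[:, r] > 0) & (counts[:, k] > 0)
            if not keep.any():
                continue
            # M values: log2 fold change of library-size-scaled counts.
            m = np.log2((counts[keep, k] / lib[k]) / (counts[keep, r] / lib[r]))
            log_f[r, k] = trimmed_mean(m, jaeckel_trim_fraction(m))
    mean_log = log_f.mean(axis=0)         # average over references
    mean_log -= mean_log.mean()           # factors multiply to 1
    return 2.0 ** mean_log
```

Multiplying each library size by its factor gives an effective library size. In this sketch, genes with extreme log fold changes fall into the trimmed tails, so they influence the factors less than the bulk of unchanged genes does.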

