
Winsorization greatly reduces false positives by popular differential expression methods when analyzing human population samples.

Affiliations

Division of Computational Biology, Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, 55905, USA.

Center for Individualized Medicine, Mayo Clinic, Rochester, MN, 55905, USA.

Publication information

Genome Biol. 2024 Oct 30;25(1):282. doi: 10.1186/s13059-024-03230-w.

DOI: 10.1186/s13059-024-03230-w
PMID: 39478636
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11523781/

Abstract

A recent study found severely inflated type I error rates for DESeq2 and edgeR, two dominant tools used for differential expression analysis of RNA-seq data. Here, we show that by properly addressing the outliers in the RNA-Seq data using winsorization, the type I error rate of DESeq2 and edgeR can be substantially reduced, and the power is comparable to Wilcoxon rank-sum test for large datasets. Therefore, as an alternative to Wilcoxon rank-sum test, they may still be applied for differential expression analysis of large RNA-Seq datasets.
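
The abstract does not spell out the winsorization procedure, so the following is only a minimal sketch of the general idea: within each gene, values beyond a chosen quantile are capped at that quantile before the counts are passed to a differential expression tool. The function name `winsorize_rows`, the upper-tail-only default, and the quantile values are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def winsorize_rows(counts, lower_q=0.0, upper_q=0.95):
    """Cap each row (gene) at its own lower/upper quantiles.

    Hypothetical helper: the defaults (upper tail only, 95th
    percentile) are assumptions, not values from the paper.
    """
    counts = np.asarray(counts, dtype=float)
    # Per-gene quantiles, kept as column vectors so they broadcast.
    lo = np.quantile(counts, lower_q, axis=1, keepdims=True)
    hi = np.quantile(counts, upper_q, axis=1, keepdims=True)
    return np.clip(counts, lo, hi)

# Toy matrix: 2 genes x 10 samples; gene 0 has one extreme outlier.
toy = np.array([
    [10, 12, 11, 9, 13, 10, 11, 12, 10, 5000],
    [3, 4, 5, 4, 3, 5, 4, 3, 4, 5],
])
capped = winsorize_rows(toy, upper_q=0.9)

# For tools that expect integer counts, round after capping.
capped_int = np.rint(capped).astype(int)
```

In this toy case only the single extreme value in the first gene is pulled down; the second gene, which has no outlier, is left unchanged. With small samples the interpolated quantile can still sit well above the bulk of the data, so the choice of quantile matters in practice.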


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f25/11523781/2af21073adf4/13059_2024_3230_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f25/11523781/1607c63786d4/13059_2024_3230_Fig1_HTML.jpg

Similar articles

1
Winsorization greatly reduces false positives by popular differential expression methods when analyzing human population samples.
Genome Biol. 2024 Oct 30;25(1):282. doi: 10.1186/s13059-024-03230-w.
2
Response to "Neglecting normalization impact in semi-synthetic RNA-seq data simulation generates artificial false positives" and "Winsorization greatly reduces false positives by popular differential expression methods when analyzing human population samples".
Genome Biol. 2024 Oct 30;25(1):283. doi: 10.1186/s13059-024-03232-8.
3
Exaggerated false positives by popular differential expression methods when analyzing human population samples.
Genome Biol. 2022 Mar 15;23(1):79. doi: 10.1186/s13059-022-02648-4.
4
Neglecting the impact of normalization in semi-synthetic RNA-seq data simulations generates artificial false positives.
Genome Biol. 2024 Oct 30;25(1):281. doi: 10.1186/s13059-024-03231-9.
5
Benchmarking RNA-seq differential expression analysis methods using spike-in and simulation data.
PLoS One. 2020 Apr 30;15(4):e0232271. doi: 10.1371/journal.pone.0232271. eCollection 2020.
6
Robust identification of differentially expressed genes from RNA-seq data.
Genomics. 2020 Mar;112(2):2000-2010. doi: 10.1016/j.ygeno.2019.11.012. Epub 2019 Nov 20.
7
Benchmarking differential expression analysis tools for RNA-Seq: normalization-based vs. log-ratio transformation-based methods.
BMC Bioinformatics. 2018 Jul 18;19(1):274. doi: 10.1186/s12859-018-2261-8.
8
Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies.
BMC Genomics. 2020 Jan 28;21(1):75. doi: 10.1186/s12864-020-6502-7.
9
Three Differential Expression Analysis Methods for RNA Sequencing: limma, EdgeR, DESeq2.
J Vis Exp. 2021 Sep 18(175). doi: 10.3791/62528.
10
Differential Expression Analysis in Single-Cell Transcriptomics.
Methods Mol Biol. 2019;1979:425-432. doi: 10.1007/978-1-4939-9240-9_25.

Cited by

1
Associations of lactate-to-hematocrit ratio with short- and long-term prognoses in critically ill patients with cirrhosis and sepsis: a retrospective cohort study.
BMC Infect Dis. 2025 Sep 2;25(1):1095. doi: 10.1186/s12879-025-11527-9.
2
Non-linear association between serum levels of vitamins A and B12 and accelerated epigenetic aging.
Front Nutr. 2025 Jul 28;12:1599205. doi: 10.3389/fnut.2025.1599205. eCollection 2025.
3
Machine Learning for the Prediction of Acute Kidney Injury in Critically Ill Patients With Coronary Heart Disease: Algorithm Development and Validation.
JMIR Med Inform. 2025 May 28;13:e72349. doi: 10.2196/72349.
4
Response to "Neglecting normalization impact in semi-synthetic RNA-seq data simulation generates artificial false positives" and "Winsorization greatly reduces false positives by popular differential expression methods when analyzing human population samples".
Genome Biol. 2024 Oct 30;25(1):283. doi: 10.1186/s13059-024-03232-8.

References

1
Exaggerated false positives by popular differential expression methods when analyzing human population samples.
Genome Biol. 2022 Mar 15;23(1):79. doi: 10.1186/s13059-022-02648-4.
2
An omnibus test for differential distribution analysis of microbiome sequencing data.
Bioinformatics. 2018 Feb 15;34(4):643-651. doi: 10.1093/bioinformatics/btx650.