aFold - using polynomial uncertainty modelling for differential gene expression estimation from RNA sequencing data.

Affiliations

Evolutionary Ecology and Genetics, Zoological Institute, CAU Kiel, Am Botanischen Garten 9, 24118, Kiel, Germany.

Institute for Clinical Molecular Biology, CAU Kiel, Am Botanischen Garten 11, 24118, Kiel, Germany.

Publication information

BMC Genomics. 2019 May 10;20(1):364. doi: 10.1186/s12864-019-5686-1.

DOI:10.1186/s12864-019-5686-1
PMID:31077153
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6509820/
Abstract

BACKGROUND

Data normalization and identification of significant differential expression represent crucial steps in RNA-Seq analysis. Many available tools rely on assumptions that are often not met by real data, including the common assumptions of a symmetrical distribution of up- and down-regulated genes and the presence of only a few differentially expressed genes and/or few outliers. Moreover, the cut-off for selecting significantly differentially expressed genes for further downstream analysis often depends on arbitrary choices.
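The symmetry assumption mentioned above can be illustrated with a small worked example. The sketch below is not qtotal or aFold; it uses plain total-count scaling, chosen here only as a simple stand-in for a normalization that implicitly assumes balanced up- and down-regulation. The gene counts, the 4-fold change, and the function names are hypothetical values made up for illustration.

```python
import math


def total_count_scale(sample, reference):
    """Rescale `sample` so that its total read count matches `reference`."""
    factor = sum(reference) / sum(sample)
    return [count * factor for count in sample]


def log2_fold_changes(control, treated):
    """Per-gene log2 ratio of treated over control counts."""
    return [math.log2(t / c) for c, t in zip(control, treated)]


# Hypothetical data: 10 genes with 100 reads each in the control sample.
control = [100.0] * 10

# Asymmetric response: 3 genes are truly 4-fold up, 7 are unchanged,
# and no gene is down-regulated.
treated = [400.0] * 3 + [100.0] * 7

normalized = total_count_scale(treated, control)
lfc = log2_fold_changes(control, normalized)

# Scaling by the total pulls every gene down by the same factor (10/19),
# so the 7 unchanged genes appear down-regulated (log2 FC of about -0.93)
# and the true 4-fold change (log2 FC = 2) shrinks to about 1.07.
print("unchanged genes:", round(lfc[3], 3))
print("up-regulated genes:", round(lfc[0], 3))
```

The gap between the two groups stays at exactly 2 log2 units, but the whole distribution is shifted, which is why a fixed fold-change cut-off would misclassify the unchanged genes in this toy case.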

RESULTS

We here introduce a new tool for estimating differential expression in noisy real-life data. It employs a novel normalization procedure (qtotal) that takes the overall distribution of read counts into account for data standardization, enhancing the reliable identification of differential gene expression, especially in the case of asymmetrical distributions of up- and down-regulated genes. The tool then introduces a polynomial algorithm (aFold) to model the uncertainty of read counts across treatments and genes. We extensively benchmark aFold on a variety of simulated and validated real-life data sets (e.g. ABRF, SEQC and MAQC-II) and show a higher ability to correctly identify differentially expressed genes under most tested conditions. aFold infers fold change values that are comparable across experiments, thereby facilitating data clustering, visualization, and other downstream applications.

CONCLUSIONS

We here present a new transcriptomics analysis tool that includes both a data normalization method and a differential expression analysis approach. The new tool is shown to enhance reliable identification of significant differential expression across distinct data distributions. It outcompetes alternative procedures in the case of asymmetrical distributions of up- versus down-regulated genes and in the presence of outliers, both of which are common in real data sets.
