Effect of high variation in transcript expression on identifying differentially expressed genes in RNA-seq analysis.

Author information

Key Laboratory of Biomedical Engineering & Technology of Shandong High School, Qilu Medical University, Zibo, P. R. China.

Xuzhou Medical University, Xuzhou, P. R. China.

Publication information

Ann Hum Genet. 2021 Nov;85(6):235-244. doi: 10.1111/ahg.12441. Epub 2021 Aug 3.

DOI: 10.1111/ahg.12441
PMID: 34341986

Abstract

Great efforts have been made on the algorithms that deal with RNA-seq data to enhance the accuracy and efficiency of differential expression (DE) analysis. However, no consensus has been reached on the proper threshold values of fold change and adjusted p-value for filtering differentially expressed genes (DEGs). It is generally believed that the more stringent the filtering threshold, the more reliable the result of a DE analysis. Nevertheless, by analyzing the impact of both adjusted p-value and fold change thresholds on DE analyses, with RNA-seq data obtained for three different cancer types from the Cancer Genome Atlas (TCGA) database, we found that, for a given sample size, the reproducibility of DE results became poorer when more stringent thresholds were applied. No matter which threshold level was applied, the overlap rates of DEGs were generally lower for small sample sizes than for large sample sizes. The raw read count analysis demonstrated that the transcript expression of the same gene in different samples, whether in tumor groups or in normal groups, showed high variations, which resulted in a drastic fluctuation in fold change values and adjusted p-values when different sets of samples were used. Overall, more stringent thresholds did not yield more reliable DEGs due to high variations in transcript expression; the reliability of DEGs obtained with small sample sizes was more susceptible to these variations. Therefore, less stringent thresholds are recommended for screening DEGs. Moreover, large sample sizes should be considered in RNA-seq experimental designs to reduce the interfering effect of variations in transcript expression on DEG identification.
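
To make the idea of an "overlap rate" under different thresholds concrete, the following is a minimal Python sketch, not the authors' pipeline. The abstract does not define how the overlap rate was computed, so the Jaccard index used here is an assumption, as are the function names `filter_degs` and `overlap_rate`. The two result tables are made-up toy values, chosen only to show how a gene whose fold change fluctuates around a strict cutoff lowers the overlap between two sample sets.

```python
from typing import Dict, Set, Tuple

# Hypothetical DE result format: gene -> (log2 fold change, adjusted p-value).
DEResult = Dict[str, Tuple[float, float]]


def filter_degs(result: DEResult, min_abs_log2fc: float, max_padj: float) -> Set[str]:
    """Return genes passing both the fold change and adjusted p-value thresholds."""
    return {
        gene
        for gene, (log2fc, padj) in result.items()
        if abs(log2fc) >= min_abs_log2fc and padj <= max_padj
    }


def overlap_rate(degs_a: Set[str], degs_b: Set[str]) -> float:
    """Jaccard index of two DEG sets (an assumed definition of overlap rate)."""
    union = degs_a | degs_b
    if not union:
        return 0.0
    return len(degs_a & degs_b) / len(union)


# Toy values for the same genes analysed with two different sample sets.
# These numbers are invented for illustration and are not from the paper.
run_a: DEResult = {
    "G1": (3.2, 1e-8),
    "G2": (1.4, 2e-3),
    "G3": (2.1, 4e-3),
    "G4": (1.1, 3e-2),
    "G5": (0.3, 0.6),
}
run_b: DEResult = {
    "G1": (2.9, 1e-6),
    "G2": (1.2, 4e-3),
    "G3": (1.8, 2e-2),
    "G4": (1.3, 8e-3),
    "G5": (0.4, 0.5),
}

# Lenient thresholds: |log2FC| >= 1 and adjusted p <= 0.05.
lenient = overlap_rate(filter_degs(run_a, 1.0, 0.05), filter_degs(run_b, 1.0, 0.05))
# Strict thresholds: |log2FC| >= 2 and adjusted p <= 0.01.
strict = overlap_rate(filter_degs(run_a, 2.0, 0.01), filter_degs(run_b, 2.0, 0.01))

print(f"lenient overlap: {lenient:.2f}, strict overlap: {strict:.2f}")
```

In this toy case, gene G3 passes the strict cutoff in one run but not in the other, so the two strict DEG sets disagree while the lenient sets are identical. With real data, the per-gene tables would come from a DE tool run on different sample subsets.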

Similar articles

1
Effect of high variation in transcript expression on identifying differentially expressed genes in RNA-seq analysis.
Ann Hum Genet. 2021 Nov;85(6):235-244. doi: 10.1111/ahg.12441. Epub 2021 Aug 3.
2
The Vacc-SeqQC project: Benchmarking RNA-Seq for clinical vaccine studies.
Front Immunol. 2023 Jan 19;13:1093242. doi: 10.3389/fimmu.2022.1093242. eCollection 2022.
3
High heterogeneity undermines generalization of differential expression results in RNA-Seq analysis.
Hum Genomics. 2021 Jan 28;15(1):7. doi: 10.1186/s40246-021-00308-5.
4
A comparison of transcriptome analysis methods with reference genome.
BMC Genomics. 2022 Mar 25;23(1):232. doi: 10.1186/s12864-022-08465-0.
5
Robust identification of differentially expressed genes from RNA-seq data.
Genomics. 2020 Mar;112(2):2000-2010. doi: 10.1016/j.ygeno.2019.11.012. Epub 2019 Nov 20.
6
Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies.
BMC Genomics. 2020 Jan 28;21(1):75. doi: 10.1186/s12864-020-6502-7.
7
Differential expression analysis using a model-based gene clustering algorithm for RNA-seq data.
BMC Bioinformatics. 2021 Oct 20;22(1):511. doi: 10.1186/s12859-021-04438-4.
8
Effect of low-expression gene filtering on detection of differentially expressed genes in RNA-seq data.
Annu Int Conf IEEE Eng Med Biol Soc. 2015;2015:6461-4. doi: 10.1109/EMBC.2015.7319872.
9
RNA-seq assistant: machine learning based methods to identify more transcriptional regulated genes.
BMC Genomics. 2018 Jul 20;19(1):546. doi: 10.1186/s12864-018-4932-2.
10
Parallel comparison of Illumina RNA-Seq and Affymetrix microarray platforms on transcriptomic profiles generated from 5-aza-deoxy-cytidine treated HT-29 colon cancer cells and simulated datasets.
BMC Bioinformatics. 2013;14 Suppl 9(Suppl 9):S1. doi: 10.1186/1471-2105-14-S9-S1. Epub 2013 Jun 28.

Cited by

1
Elevated Methylation Contributes to Suppressed Expression of Special AT-Rich Sequence-Binding Protein 2 in Colorectal Cancer: A Gene-Disease Association Study.
Health Sci Rep. 2025 Jul 9;8(7):e71056. doi: 10.1002/hsr2.71056. eCollection 2025 Jul.
2
Untangling the complex mechanisms associated with Alzheimer's disease in elderly patients using high-throughput RNA sequencing data and next-generation knowledge discovery methods: Focus on potential gene signatures and drugs for dementia.
Heliyon. 2024 Dec 18;11(1):e41266. doi: 10.1016/j.heliyon.2024.e41266. eCollection 2025 Jan 15.
3
NETest: serial liquid biopsies in gastroenteropancreatic NET surveillance.
Endocr Connect. 2022 Sep 7;11(10). doi: 10.1530/EC-22-0146. Print 2022 Oct 1.
4
Transcriptomic and proteomic retinal pigment epithelium signatures of age-related macular degeneration.
Nat Commun. 2022 Jul 26;13(1):4233. doi: 10.1038/s41467-022-31707-4.