A model based criterion for gene expression calls using RNA-seq data.

Authors

Wagner Günter P, Kin Koryu, Lynch Vincent J

Affiliation

Yale Systems Biology Institute, 300 Heffernan Drive, West Haven, CT 06516, USA.

Publication

Theory Biosci. 2013 Sep;132(3):159-64. doi: 10.1007/s12064-013-0178-3. Epub 2013 Apr 25.

DOI: 10.1007/s12064-013-0178-3
PMID: 23615947

Abstract

The power of deep sequencing technology to reliably detect single RNA reads leads to a paradoxical problem of high sensitivity. In hybridization- or PCR-based methods for RNA quantification, the concern is low sensitivity, i.e., the signal from truly expressed genes might not be distinguishable from noise. In contrast, the problem with RNA-seq is that it is not clear whether very low read counts come from lowly expressed genes or merely from transcriptional noise. The frequency distribution of read counts does not show a clear separation into two classes of genes, which makes the decision whether a gene is to be considered expressed or not seemingly arbitrary. Here we address this problem by suggesting a statistical model that treats the number of transcripts detected in an RNA-seq study as a mixture of two distributions: an exponential distribution for transcripts from inactive genes and a negative binomial distribution for transcripts from actively transcribed genes. We apply this model to a number of RNA-seq data sets and find that it fits the data very well. The calculated criterion for distinguishing between expressed and non-expressed genes is remarkably consistent among data sets, suggesting that genes with more than two transcripts per million transcripts (TPM) are highly likely to be actively transcribed. This criterion is consistent with the criterion of 1 RPKM proposed by Hebenstreit et al. (Mol Syst Biol 7:497, 2011) on the basis of chromatin modification and per-cell RNA expression data. Hence, the regression model correctly identifies the class of genes that are not actively expressed and thus provides an operational criterion for classifying genes into expressed and non-expressed sets, facilitating the interpretation of RNA-seq data.
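The two-component mixture described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' fitting procedure, and every parameter value in it (`w_active`, `rate`, `r`, `p`) is an arbitrary placeholder, not an estimate from the paper. As a further assumption, the exponential component is discretized into unit-width TPM bins so that both components are probability mass functions over integer TPM values. The sketch computes the posterior probability that a gene at a given TPM belongs to the actively transcribed component and reports the lowest TPM at which that probability reaches a chosen cutoff.

```python
import math


def inactive_pmf(k, rate):
    """Mass of an Exponential(rate) distribution in the unit-width bin [k, k + 1).

    Discretizing the exponential this way is an assumption of this sketch.
    """
    return math.exp(-rate * k) - math.exp(-rate * (k + 1))


def active_pmf(k, r, p):
    """Negative binomial pmf with real-valued size r and success probability p."""
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def posterior_active(k, w_active, rate, r, p):
    """Posterior probability that a gene observed at k TPM is actively transcribed."""
    active = w_active * active_pmf(k, r, p)
    inactive = (1.0 - w_active) * inactive_pmf(k, rate)
    return active / (active + inactive)


def expression_threshold(w_active, rate, r, p, cutoff=0.5, max_tpm=1000):
    """Lowest integer TPM whose posterior reaches the cutoff, or None if none does."""
    for k in range(max_tpm + 1):
        if posterior_active(k, w_active, rate, r, p) >= cutoff:
            return k
    return None


# Hypothetical parameters chosen only for illustration; they are NOT fitted
# values from the paper or from any real RNA-seq data set.
PARAMS = dict(w_active=0.5, rate=1.5, r=0.6, p=0.02)

if __name__ == "__main__":
    for tpm in range(6):
        print(tpm, round(posterior_active(tpm, **PARAMS), 3))
    print("threshold (TPM):", expression_threshold(**PARAMS))
```

In practice the mixture weight and the component parameters would be estimated from the observed distribution of TPM values in each data set, and the resulting threshold depends on the posterior cutoff chosen.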

Similar articles

1
A model based criterion for gene expression calls using RNA-seq data.
Theory Biosci. 2013 Sep;132(3):159-64. doi: 10.1007/s12064-013-0178-3. Epub 2013 Apr 25.
2
Comparison of normalization and differential expression analyses using RNA-Seq data from 726 individual Drosophila melanogaster.
BMC Genomics. 2016 Jan 5;17:28. doi: 10.1186/s12864-015-2353-z.
3
Differential expression analysis of RNA sequencing data by incorporating non-exonic mapped reads.
BMC Genomics. 2015;16 Suppl 7(Suppl 7):S14. doi: 10.1186/1471-2164-16-S7-S14. Epub 2015 Jun 11.
4
Bias detection and correction in RNA-Sequencing data.
BMC Bioinformatics. 2011 Jul 19;12:290. doi: 10.1186/1471-2105-12-290.
5
Single-cell transcriptome analysis of endometrial tissue.
Hum Reprod. 2016 Apr;31(4):844-53. doi: 10.1093/humrep/dew008. Epub 2016 Feb 13.
6
Next-generation sequencing facilitates quantitative analysis of wild-type and Nrl(-/-) retinal transcriptomes.
Mol Vis. 2011;17:3034-54. Epub 2011 Nov 23.
7
Misuse of RPKM or TPM normalization when comparing across samples and sequencing protocols.
RNA. 2020 Aug;26(8):903-909. doi: 10.1261/rna.074922.120. Epub 2020 Apr 13.
8
FDM: a graph-based statistical method to detect differential transcription using RNA-seq data.
Bioinformatics. 2011 Oct 1;27(19):2633-40. doi: 10.1093/bioinformatics/btr458. Epub 2011 Aug 8.
9
Single read and paired end mRNA-Seq Illumina libraries from 10 nanograms total RNA.
J Vis Exp. 2011 Oct 27(56):e3340. doi: 10.3791/3340.
10
A two-parameter generalized Poisson model to improve the analysis of RNA-seq data.
Nucleic Acids Res. 2010 Sep;38(17):e170. doi: 10.1093/nar/gkq670. Epub 2010 Jul 29.

Cited by

1
RNA-seq analysis of blood from cave- and surface-dwelling morphs reveal diverse transcriptomic responses to normoxic rearing.
Front Physiol. 2025 Jul 17;16:1617136. doi: 10.3389/fphys.2025.1617136. eCollection 2025.
2
Identification of novel hepaciviruses and -associated viruses metatranscriptomics in North American lagomorphs.
Virus Evol. 2025 Jul 2;11(1):veaf050. doi: 10.1093/ve/veaf050. eCollection 2025.
3
Quantifying Transcriptome Turnover on Phylogenies by Modeling Gene Expression as a Binary Trait.
Mol Biol Evol. 2025 Apr 30;42(5). doi: 10.1093/molbev/msaf106.
4
Large-scale integration of meta-QTL and genome-wide association study identifies genomic regions and candidate genes for photosynthetic efficiency traits in bread wheat.
BMC Genomics. 2025 Mar 22;26(1):284. doi: 10.1186/s12864-025-11472-6.
5
Bioinformatics insights into ACSL1 and ACSL5: prognostic and immune roles in low-grade glioma.
BMC Cancer. 2025 Feb 10;25(1):226. doi: 10.1186/s12885-025-13651-w.
6
Melocular Evolution on Cold Temperature Adaptation of Chinese Rhesus Macaques.
Curr Genomics. 2025;26(1):36-47. doi: 10.2174/0113892029301969240708094053. Epub 2024 Jul 10.
7
Genetic dissection of flag leaf morphology traits and fine mapping of a novel QTL (Qflw.sxau-6BL) in bread wheat (Triticum aestivum L.).
Theor Appl Genet. 2025 Jan 8;138(1):21. doi: 10.1007/s00122-024-04802-x.
8
Meta-QTL mapping for wheat thousand kernel weight.
Front Plant Sci. 2024 Dec 16;15:1499055. doi: 10.3389/fpls.2024.1499055. eCollection 2024.
9
Exploitation of phylum-spanning omics resources reveals complexity in the nematode FLP signalling system and provides insights into flp-gene evolution.
BMC Genomics. 2024 Dec 19;25(1):1220. doi: 10.1186/s12864-024-11111-6.
10
Tumor-associated antigen prediction using a single-sample gene expression state inference algorithm.
Cell Rep Methods. 2024 Nov 18;4(11):100906. doi: 10.1016/j.crmeth.2024.100906.

References

1
Measurement of mRNA abundance using RNA-seq data: RPKM measure is inconsistent among samples.
Theory Biosci. 2012 Dec;131(4):281-5. doi: 10.1007/s12064-012-0162-3. Epub 2012 Aug 8.
2
RNA sequencing reveals two major classes of gene expression levels in metazoan cells.
Mol Syst Biol. 2011 Jun 7;7:497. doi: 10.1038/msb.2011.28.
3
Differential expression analysis for sequence count data.
Genome Biol. 2010;11(10):R106. doi: 10.1186/gb-2010-11-10-r106. Epub 2010 Oct 27.
4
A scaling normalization method for differential expression analysis of RNA-seq data.
Genome Biol. 2010;11(3):R25. doi: 10.1186/gb-2010-11-3-r25. Epub 2010 Mar 2.
5
RNA-Seq gene expression estimation with read mapping uncertainty.
Bioinformatics. 2010 Feb 15;26(4):493-500. doi: 10.1093/bioinformatics/btp692. Epub 2009 Dec 18.
6
ChIP-seq accurately predicts tissue-specific activity of enhancers.
Nature. 2009 Feb 12;457(7231):854-8. doi: 10.1038/nature07730.
7
RNA-Seq: a revolutionary tool for transcriptomics.
Nat Rev Genet. 2009 Jan;10(1):57-63. doi: 10.1038/nrg2484.
8
Mapping and quantifying mammalian transcriptomes by RNA-Seq.
Nat Methods. 2008 Jul;5(7):621-8. doi: 10.1038/nmeth.1226. Epub 2008 May 30.
9
The transcriptional landscape of the yeast genome defined by RNA sequencing.
Science. 2008 Jun 6;320(5881):1344-9. doi: 10.1126/science.1158441. Epub 2008 May 1.
10
Transcriptional noise and the fidelity of initiation by RNA polymerase II.
Nat Struct Mol Biol. 2007 Feb;14(2):103-5. doi: 10.1038/nsmb0207-103.