
RNA-Seq gene expression estimation with read mapping uncertainty.

Affiliation

Department of Computer Sciences, University of Wisconsin, Madison, WI 53706, USA.

Publication information

Bioinformatics. 2010 Feb 15;26(4):493-500. doi: 10.1093/bioinformatics/btp692. Epub 2009 Dec 18.

DOI: 10.1093/bioinformatics/btp692
PMID: 20022975
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2820677/

Abstract

MOTIVATION

RNA-Seq is a promising new technology for accurately measuring gene expression levels. Expression estimation with RNA-Seq requires the mapping of relatively short sequencing reads to a reference genome or transcript set. Because reads are generally shorter than transcripts from which they are derived, a single read may map to multiple genes and isoforms, complicating expression analyses. Previous computational methods either discard reads that map to multiple locations or allocate them to genes heuristically.

RESULTS

We present a generative statistical model and associated inference methods that handle read mapping uncertainty in a principled manner. Through simulations parameterized by real RNA-Seq data, we show that our method is more accurate than previous methods. Our improved accuracy is the result of handling read mapping uncertainty with a statistical model and the estimation of gene expression levels as the sum of isoform expression levels. Unlike previous methods, our method is capable of modeling non-uniform read distributions. Simulations with our method indicate that a read length of 20-25 bases is optimal for gene-level expression estimation from mouse and maize RNA-Seq data when sequencing throughput is fixed.
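
The abstract does not spell out the inference procedure, so the following is only a minimal sketch of how read mapping uncertainty can be handled statistically rather than by discarding multireads. It assumes an expectation-maximization (EM) style update over a deliberately simplified model: all isoforms have equal length, reads are uniformly distributed, and alignments are error-free. The function names (`em_isoform_fractions`, `gene_level`) and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def em_isoform_fractions(reads, isoforms, max_iter=1000, tol=1e-12):
    """Estimate the fraction of reads originating from each isoform.

    reads: list of tuples; each tuple lists the isoforms a read aligns to.
    Simplifying assumptions (not from the paper): equal isoform lengths,
    uniform read distribution, error-free alignments.
    """
    if not reads:
        raise ValueError("at least one read is required")
    theta = {iso: 1.0 / len(isoforms) for iso in isoforms}
    for _ in range(max_iter):
        # E-step: split each read across its candidate isoforms in
        # proportion to the current abundance estimates.
        expected = dict.fromkeys(isoforms, 0.0)
        for candidates in reads:
            total = sum(theta[iso] for iso in candidates)
            if total == 0.0:
                continue  # no candidate has support; skip this read
            for iso in candidates:
                expected[iso] += theta[iso] / total
        # M-step: renormalize expected counts into fractions.
        n = sum(expected.values())
        updated = {iso: count / n for iso, count in expected.items()}
        change = max(abs(updated[iso] - theta[iso]) for iso in isoforms)
        theta = updated
        if change < tol:
            break
    return theta


def gene_level(theta, isoform_to_gene):
    """Gene expression computed as the sum of its isoforms' expression."""
    genes = defaultdict(float)
    for iso, value in theta.items():
        genes[isoform_to_gene[iso]] += value
    return dict(genes)


# Toy data: gene A has isoforms A1 and A2, gene B has isoform B1.
isoform_to_gene = {"A1": "A", "A2": "A", "B1": "B"}
reads = (
    [("A1",)] * 3          # reads unique to A1
    + [("A2",)] * 3        # reads unique to A2
    + [("B1",)] * 2        # reads unique to B1
    + [("A1", "B1")] * 4   # multireads shared by A1 and B1
)

theta = em_isoform_fractions(reads, list(isoform_to_gene))
genes = gene_level(theta, isoform_to_gene)
print({k: round(v, 3) for k, v in theta.items()})
print({k: round(v, 3) for k, v in genes.items()})
```

On this toy input the isoform fractions settle near 0.45 (A1), 0.25 (A2), and 0.30 (B1), so the gene-level estimates are about 0.70 for gene A and 0.30 for gene B. Discarding the four multireads would instead give 6/8 = 0.75 and 2/8 = 0.25, which illustrates how the treatment of ambiguous reads can shift the estimates.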

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d114/2820677/f899542fcf00/btp692f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d114/2820677/dc9aca6d60d9/btp692f1.jpg
