Mixture models reveal multiple positional bias types in RNA-Seq data and lead to accurate transcript concentration estimates.

Authors

Tuerk Andreas, Wiktorin Gregor, Güler Serhat

Affiliation

Lexogen GmbH, Vienna, Austria.

Publication information

PLoS Comput Biol. 2017 May 15;13(5):e1005515. doi: 10.1371/journal.pcbi.1005515. eCollection 2017 May.

DOI: 10.1371/journal.pcbi.1005515
PMID: 28505151
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5448817/
Abstract

Accuracy of transcript quantification with RNA-Seq is negatively affected by positional fragment bias. This article introduces Mix2 (rd. "mixquare"), a transcript quantification method which uses a mixture of probability distributions to model and thereby neutralize the effects of positional fragment bias. The parameters of Mix2 are trained by Expectation Maximization, resulting in simultaneous transcript abundance and bias estimates. We compare Mix2 to Cufflinks, RSEM, eXpress and PennSeq, state-of-the-art quantification methods implementing some form of bias correction. On four synthetic biases we show that the accuracy of Mix2 overall exceeds the accuracy of the other methods and that its bias estimates converge to the correct solution. We further evaluate Mix2 on real RNA-Seq data from the Microarray and Sequencing Quality Control (MAQC, SEQC) Consortia. On MAQC data, Mix2 achieves improved correlation to qPCR measurements with a relative increase in R² between 4% and 50%. Mix2 also yields repeatable concentration estimates across technical replicates with a relative increase in R² between 8% and 47% and reduced standard deviation across the full concentration range. We further observe more accurate detection of differential expression with a relative increase in true positives between 74% and 378% for 5% false positives. In addition, Mix2 reveals 5 dominant biases in MAQC data deviating from the common assumption of a uniform fragment distribution. On SEQC data, Mix2 yields higher consistency between measured and predicted concentration ratios. A relative error of 20% or less is obtained for 51% of transcripts by Mix2, 40% of transcripts by Cufflinks and RSEM and 30% by eXpress. Titration order consistency is correct for 47% of transcripts for Mix2, 41% for Cufflinks and RSEM and 34% for eXpress. We further observe improved repeatability across laboratory sites with a relative increase in R² between 8% and 44% and reduced standard deviation.
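
To illustrate what "simultaneous transcript abundance and bias estimates" via Expectation Maximization can look like, here is a minimal toy sketch. It is not the Mix2 implementation: the abstract does not specify the component distributions, so this sketch assumes piecewise-constant positional bins shared by all transcripts, two hypothetical isoforms (the short one being a prefix of the long one), and made-up function names `simulate` and `em_fit`. The estimated `alpha` values are fragment shares per transcript, not length-normalized concentrations.

```python
import math
import random


def simulate(n_frag, alpha, beta, len_long=1000, len_short=600, seed=0):
    """Draw fragment start positions from two toy isoforms (assumed setup).

    Isoform 1 is taken to be the first `len_short` bases of isoform 0,
    so a fragment starting before `len_short` is compatible with both.
    """
    rng = random.Random(seed)
    lengths = [len_long, len_short]
    n_bins = len(beta)
    fragments = []
    for _ in range(n_frag):
        t = rng.choices([0, 1], weights=alpha)[0]        # true transcript
        k = rng.choices(range(n_bins), weights=beta)[0]  # true positional bin
        # Uniform inside the chosen bin, scaled to the transcript length.
        pos = (k + rng.random()) / n_bins * lengths[t]
        compat = {0: pos}
        if pos < len_short:
            compat[1] = pos
        fragments.append(compat)
    return fragments, lengths


def em_fit(fragments, lengths, n_bins=4, n_iter=100):
    """Jointly estimate fragment shares (alpha) and bin weights (beta) by EM."""
    n_tx = len(lengths)
    # For each fragment, list the (transcript, bin) pairs it is compatible with.
    pairs = [
        [(t, min(int(pos / lengths[t] * n_bins), n_bins - 1))
         for t, pos in frag.items()]
        for frag in fragments
    ]
    alpha = [1.0 / n_tx] * n_tx
    beta = [1.0 / n_bins] * n_bins
    loglik_trace = []
    n = len(pairs)
    for _ in range(n_iter):
        alpha_sum = [0.0] * n_tx
        beta_sum = [0.0] * n_bins
        loglik = 0.0
        for frag_pairs in pairs:
            # E-step: joint density of (fragment, transcript) under the
            # current parameters; 1/length makes it a density over positions.
            joint = [alpha[t] * beta[k] * n_bins / lengths[t]
                     for t, k in frag_pairs]
            total = sum(joint)
            loglik += math.log(total)
            for (t, k), j in zip(frag_pairs, joint):
                r = j / total  # responsibility of transcript t for this fragment
                alpha_sum[t] += r
                beta_sum[k] += r
        # M-step: normalized expected counts.
        alpha = [a / n for a in alpha_sum]
        beta = [b / n for b in beta_sum]
        loglik_trace.append(loglik)
    return alpha, beta, loglik_trace


fragments, lengths = simulate(2000, alpha=[0.7, 0.3], beta=[0.1, 0.2, 0.3, 0.4])
alpha_hat, beta_hat, trace = em_fit(fragments, lengths)
print("fragment shares:", [round(a, 3) for a in alpha_hat])
print("bias weights:   ", [round(b, 3) for b in beta_hat])
```

Sharing one set of bin weights across transcripts is what lets ambiguous fragments inform the bias estimate and the bias estimate inform the assignment of ambiguous fragments. A real method would also need to handle fragment lengths, many transcripts per locus, and smoother component distributions.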

