Modeling bias and variation in the stochastic processes of small RNA sequencing.

Authors

Argyropoulos Christos, Etheridge Alton, Sakhanenko Nikita, Galas David

Affiliations

Department of Internal Medicine, University of New Mexico School of Medicine, Albuquerque, NM 87106, USA.

Pacific Northwest Research Institute, Seattle, WA 98122, USA.

Publication information

Nucleic Acids Res. 2017 Jun 20;45(11):e104. doi: 10.1093/nar/gkx199.

DOI: 10.1093/nar/gkx199
PMID: 28369495
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5499834/

Abstract

The use of RNA-seq as the preferred method for the discovery and validation of small RNA biomarkers has been hindered by high quantitative variability and biased sequence counts. In this paper we develop a statistical model for sequence counts that accounts for ligase bias and stochastic variation in sequence counts. This model implies a linear quadratic relation between the mean and variance of sequence counts. Using a large number of sequencing datasets, we demonstrate how one can use the generalized additive models for location, scale and shape (GAMLSS) distributional regression framework to calculate and apply empirical correction factors for ligase bias. Bias correction could remove more than 40% of the bias for miRNAs. Empirical bias correction factors appear to be nearly constant over at least one and up to four orders of magnitude of total RNA input and independent of sample composition. Using synthetic mixes of known composition, we show that the GAMLSS approach can analyze differential expression with greater accuracy, higher sensitivity and specificity than six existing algorithms (DESeq2, edgeR, EBSeq, limma, DSS, voom) for the analysis of small RNA-seq data.
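
As a rough illustration of what a linear-quadratic mean-variance relation looks like, the sketch below fits variance = a*mean + b*mean^2 to a handful of (mean, variance) pairs. The abstract states only that such a relation exists; the negative binomial variance used to generate the example data, the dispersion value 0.1, and the helper names `nb_variance` and `fit_linear_quadratic` are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence, Tuple


def nb_variance(mean: float, dispersion: float) -> float:
    """Variance of a negative binomial count: mean + dispersion * mean**2.

    Assumption: the negative binomial is used here only as a familiar
    distribution whose variance is linear-quadratic in the mean.
    """
    return mean + dispersion * mean ** 2


def fit_linear_quadratic(
    means: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares fit of variance = a*mean + b*mean**2 (no intercept).

    Solves the 2x2 normal equations with Cramer's rule.
    """
    s11 = sum(m ** 2 for m in means)
    s12 = sum(m ** 3 for m in means)
    s22 = sum(m ** 4 for m in means)
    t1 = sum(m * v for m, v in zip(means, variances))
    t2 = sum(m ** 2 * v for m, v in zip(means, variances))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("need at least two distinct nonzero means")
    a = (t1 * s22 - s12 * t2) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b


# Hypothetical per-sequence mean counts spanning two orders of magnitude.
means = [5.0, 20.0, 100.0, 500.0]
variances = [nb_variance(m, dispersion=0.1) for m in means]
a, b = fit_linear_quadratic(means, variances)
print(f"linear coefficient a = {a:.4f}, quadratic coefficient b = {b:.4f}")
```

Because the example variances are generated exactly from the assumed form, the fit should recover a close to 1 and b close to 0.1. With real replicate data, the pairs would instead be per-sequence sample means and sample variances, and the fitted coefficients would only be estimates.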

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820b/5499834/f1b5ef58aaae/gkx199fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820b/5499834/f9692cdf9118/gkx199fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820b/5499834/633c440d5037/gkx199fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820b/5499834/e7bca6f9c81d/gkx199fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820b/5499834/9a6acf7fbafa/gkx199fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820b/5499834/cbba75fa01e9/gkx199fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820b/5499834/cb8fb9cfb2fd/gkx199fig7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820b/5499834/1c7e1b96f206/gkx199fig8.jpg

Similar articles

1
Modeling bias and variation in the stochastic processes of small RNA sequencing.
Nucleic Acids Res. 2017 Jun 20;45(11):e104. doi: 10.1093/nar/gkx199.
2
Three Differential Expression Analysis Methods for RNA Sequencing: limma, EdgeR, DESeq2.
J Vis Exp. 2021 Sep 18(175). doi: 10.3791/62528.
3
Robust identification of differentially expressed genes from RNA-seq data.
Genomics. 2020 Mar;112(2):2000-2010. doi: 10.1016/j.ygeno.2019.11.012. Epub 2019 Nov 20.
4
BALLI: Bartlett-adjusted likelihood-based linear model approach for identifying differentially expressed genes with RNA-seq data.
BMC Genomics. 2019 Jul 2;20(1):540. doi: 10.1186/s12864-019-5851-6.
5
voom: Precision weights unlock linear model analysis tools for RNA-seq read counts.
Genome Biol. 2014 Feb 3;15(2):R29. doi: 10.1186/gb-2014-15-2-r29.
6
No counts, no variance: allowing for loss of degrees of freedom when assessing biological variability from RNA-seq data.
Stat Appl Genet Mol Biol. 2017 Apr 25;16(2):83-93. doi: 10.1515/sagmb-2017-0010.
7
Modelling RNA-Seq data with a zero-inflated mixture Poisson linear model.
Genet Epidemiol. 2019 Oct;43(7):786-799. doi: 10.1002/gepi.22246. Epub 2019 Jul 22.
8
Benchmarking association analyses of continuous exposures with RNA-seq in observational studies.
Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab194.
9
MLSeq: Machine learning interface for RNA-sequencing data.
Comput Methods Programs Biomed. 2019 Jul;175:223-231. doi: 10.1016/j.cmpb.2019.04.007. Epub 2019 Apr 29.
10
Robustness of differential gene expression analysis of RNA-seq.
Comput Struct Biotechnol J. 2021 May 26;19:3470-3481. doi: 10.1016/j.csbj.2021.05.040. eCollection 2021.

Articles citing this article

1
Function and regulation of miR-186-5p, miR-125b-5p and miR-1260a in chordoma.
BMC Cancer. 2023 Nov 27;23(1):1152. doi: 10.1186/s12885-023-11238-x.
2
A heteroskedastic model of Park Grass spring hay yields in response to weather suggests continuing yield decline with climate change in future decades.
J R Soc Interface. 2022 Aug;19(193):20220361. doi: 10.1098/rsif.2022.0361. Epub 2022 Aug 24.
3
Analysis and correction of compositional bias in sparse sequencing count data.
BMC Genomics. 2018 Nov 6;19(1):799. doi: 10.1186/s12864-018-5160-5.
4
Role of MicroRNAs in Renal Parenchymal Diseases-A New Dimension.
Int J Mol Sci. 2018 Jun 17;19(6):1797. doi: 10.3390/ijms19061797.
5
Study protocol: rationale and design of the community-based prospective cohort study of kidney function and diabetes in rural New Mexico, the COMPASS study.
BMC Nephrol. 2018 Feb 27;19(1):47. doi: 10.1186/s12882-018-0842-4.
6
Fingerprints of Modified RNA Bases from Deep Sequencing Profiles.
J Am Chem Soc. 2017 Nov 29;139(47):17074-17081. doi: 10.1021/jacs.7b07914. Epub 2017 Nov 17.

References cited by this article

1
Empirical insights into the stochasticity of small RNA sequencing.
Sci Rep. 2016 Apr 7;6:24061. doi: 10.1038/srep24061.
2
Translating RNA sequencing into clinical diagnostics: opportunities and challenges.
Nat Rev Genet. 2016 May;17(5):257-71. doi: 10.1038/nrg.2016.10. Epub 2016 Mar 21.
3
Addressing Bias in Small RNA Library Preparation for Sequencing: A New Protocol Recovers MicroRNAs that Evade Capture by Current Methods.
Front Genet. 2015 Dec 22;6:352. doi: 10.3389/fgene.2015.00352. eCollection 2015.
4
Computational analysis of stochastic heterogeneity in PCR amplification efficiency revealed by single molecule barcoding.
Sci Rep. 2015 Oct 13;5:14629. doi: 10.1038/srep14629.
5
Sources of PCR-induced distortions in high-throughput sequencing data sets.
Nucleic Acids Res. 2015 Dec 2;43(21):e143. doi: 10.1093/nar/gkv717. Epub 2015 Jul 17.
6
Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors.
Perspect Psychol Sci. 2014 Nov;9(6):641-51. doi: 10.1177/1745691614551642.
7
Bias in ligation-based small RNA sequencing library construction is determined by adaptor and RNA structure.
PLoS One. 2015 May 5;10(5):e0126049. doi: 10.1371/journal.pone.0126049. eCollection 2015.
8
Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses.
Nucleic Acids Res. 2015 Sep 3;43(15):e97. doi: 10.1093/nar/gkv412. Epub 2015 Apr 29.
9
limma powers differential expression analyses for RNA-sequencing and microarray studies.
Nucleic Acids Res. 2015 Apr 20;43(7):e47. doi: 10.1093/nar/gkv007. Epub 2015 Jan 20.
10
Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.
Genome Biol. 2014;15(12):550. doi: 10.1186/s13059-014-0550-8.