Enhancing RNA-seq bias mitigation with the Gaussian self-benchmarking framework: towards unbiased sequencing data.

Affiliations

Faculty of Synthetic Biology, Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Shenzhen University of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.

State Key Laboratory of Chemical Oncogenomics, School of Chemical Biology and Biotechnology, Peking University Shenzhen Graduate School, Shenzhen, China.

Publication information

BMC Genomics. 2024 Sep 30;25(1):904. doi: 10.1186/s12864-024-10814-0.

DOI:10.1186/s12864-024-10814-0
PMID:39350040
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11441123/
Abstract

BACKGROUND

RNA sequencing is a vital technique for analyzing RNA behavior in cells, but it often suffers from various biases that distort the data. Traditional methods to address these biases are typically empirical and handle them individually, limiting their effectiveness. Our study introduces the Gaussian Self-Benchmarking (GSB) framework, a novel approach that leverages the natural distribution patterns of guanine (G) and cytosine (C) content in RNA to mitigate multiple biases simultaneously. This method is grounded in a theoretical model, organizing k-mers based on their GC content and applying a Gaussian model for alignment to ensure empirical sequencing data closely match their theoretical distribution.
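The first step the abstract mentions, organizing k-mers by their GC content, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `gc_count` and `kmer_counts_by_gc`, the choice of `k`, and the use of raw read strings as input are all assumptions made for illustration.

```python
from collections import Counter


def gc_count(kmer: str) -> int:
    """Number of G or C bases in a k-mer (case-insensitive)."""
    return sum(1 for base in kmer.upper() if base in "GC")


def kmer_counts_by_gc(reads, k):
    """Tally every overlapping k-mer in the reads by its GC count.

    Hypothetical helper: returns {gc_count: number_of_kmers_observed}.
    Reads shorter than k contribute nothing.
    """
    totals = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            totals[gc_count(read[i:i + k])] += 1
    return dict(totals)
```

For example, `kmer_counts_by_gc(["GGCA"], 2)` groups the k-mers GG, GC, and CA into GC-count bins 2, 2, and 1. The resulting per-bin totals are the kind of empirical distribution that a Gaussian model could then be compared against.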

RESULTS

The GSB framework demonstrated superior performance in mitigating sequencing biases compared to existing methods. Testing with synthetic RNA constructs and real human samples showed that the GSB approach not only addresses individual biases more effectively but also manages co-existing biases jointly. The framework's reliance on accurately pre-determined parameters like mean and standard deviation of GC content distribution allows for a more precise representation of RNA samples. This results in improved accuracy and reliability of RNA sequencing data, enhancing our understanding of RNA behavior in health and disease.
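To make the role of the pre-determined mean and standard deviation concrete, the sketch below compares observed per-GC-bin counts with a Gaussian expectation and derives a per-bin scaling factor. It is an illustration under stated assumptions, not the published GSB procedure: the function names, the normalization over only the observed bins, and the simple ratio-based correction are all hypothetical choices.

```python
import math


def gaussian_expected_fractions(gc_values, mean, std):
    """Gaussian weights at each GC value, normalized to sum to 1.

    `mean` and `std` stand in for the pre-determined parameters the
    abstract refers to; the values passed in are up to the caller.
    """
    weights = {g: math.exp(-0.5 * ((g - mean) / std) ** 2) for g in gc_values}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


def correction_factors(observed_counts, mean, std):
    """Per-bin factor = expected count / observed count (assumed scheme).

    Bins with zero observed counts are skipped to avoid division by zero.
    """
    total_observed = sum(observed_counts.values())
    expected = gaussian_expected_fractions(observed_counts.keys(), mean, std)
    return {
        g: expected[g] * total_observed / count
        for g, count in observed_counts.items()
        if count > 0
    }
```

With observed counts `{1: 10, 2: 20, 3: 10}`, `mean=2`, and `std=1`, the central bin is scaled down and the two outer bins are scaled up by the same amount, while the total count is preserved.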

CONCLUSIONS

The GSB framework presents a significant advancement in RNA sequencing analysis by providing a well-validated, multi-bias mitigation strategy. It functions independently from previously identified dataset flaws and sets a new standard for unbiased RNA sequencing results. This development enhances the reliability of RNA studies, broadening the potential for scientific breakthroughs in medicine and biology, particularly in genetic disease research and the development of targeted treatments.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/abe2/11441123/c4a7085514df/12864_2024_10814_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/abe2/11441123/2ef0b774992e/12864_2024_10814_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/abe2/11441123/6ebbf4fcff0f/12864_2024_10814_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/abe2/11441123/834f19574512/12864_2024_10814_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/abe2/11441123/97a7b62abd39/12864_2024_10814_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/abe2/11441123/ae4e41d30f18/12864_2024_10814_Fig6_HTML.jpg
