Comprehensive Analysis of the Influence of Technical and Biological Variations on De Novo Assembly of RNA-Seq Datasets.

Authors

Sergio Alberto Gonzalez, Maximo Rivarola, Andres Ribone, Sergio Lew, Norma Paniego

Affiliations

Instituto de Agrobiotecnología y Biología Molecular (IABIMO), CICVyA, Instituto Nacional de Tecnología Agropecuaria (INTA), Buenos Aires, Argentina.

Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Buenos Aires, Argentina.

Publication details

Bioinform Biol Insights. 2024 Dec 5;18:11779322241274957. doi: 10.1177/11779322241274957. eCollection 2024.

DOI: 10.1177/11779322241274957
PMID: 39649541
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11622296/

Abstract

De novo assembly of transcriptomes from species without reference genome remains a common problem in functional genomics. While methods and algorithms for transcriptome assembly are continually being developed and published, the quality of de novo assemblies using short reads depends on the complexity of the transcriptome and is limited by several types of errors. One problem to overcome is the research gap regarding the best method to use in each study to obtain high-quality de novo assembly. Currently, there are no established protocols for solving the assembly problem considering the transcriptome complexity. In addition, the accuracy of quality metrics used to evaluate assemblies remains unclear. In this study, we investigate and discuss how different variables accounting for the complexity of RNA-Seq data influence assembly results independently of the software used. For this purpose, we simulated transcriptomic short-read sequence datasets from high-quality full-length predicted transcript models with varying degrees of complexity. Subsequently, we conducted de novo assemblies using different assembly programs, and compared and classified the results using both reference-dependent and independent metrics. These metrics were assessed both individually and combined through multivariate analysis. The degree of alternative splicing and the fragment size of the paired-end reads were identified as the variables with the greatest influence on the assembly results. Moreover, read length and fragment size had different influences on the reconstruction of longer and shorter transcripts. These results underscore the importance of understanding the composition of the transcriptome under study, and making experimental design decisions related to the need to work with reads and fragments of different sizes. In addition, the choice of assembly software will positively impact the final assembly outcome. 
This selection will affect the completeness of represented genes and assembled isoforms, as well as contribute to error reduction.
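
As a rough illustration of the two technical variables the abstract singles out, read length and fragment size, the Python sketch below draws error-free paired-end reads from a single transcript sequence. It is not the simulator used in the study, which the abstract does not name. The function names, the fixed fragment size, the uniform choice of start positions, and the absence of sequencing errors are all simplifying assumptions made here for illustration.

```python
import random

# Translation table for complementing DNA bases (assumes an ACGT-only alphabet).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_paired_end(transcript: str, read_length: int, fragment_size: int,
                        n_pairs: int, seed: int = 0) -> list:
    """Sample error-free paired-end reads from one transcript.

    Hypothetical helper for illustration only: every fragment has exactly
    `fragment_size` bases and its start position is drawn uniformly.
    """
    if read_length > fragment_size:
        raise ValueError("read_length must not exceed fragment_size")
    max_start = len(transcript) - fragment_size
    if max_start < 0:
        # The transcript is shorter than one fragment, so under this
        # simplified model it contributes no read pairs at all.
        return []
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        start = rng.randint(0, max_start)
        fragment = transcript[start:start + fragment_size]
        read1 = fragment[:read_length]                       # forward mate
        read2 = reverse_complement(fragment[-read_length:])  # reverse mate
        pairs.append((read1, read2))
    return pairs


if __name__ == "__main__":
    rng = random.Random(42)
    long_tx = "".join(rng.choice("ACGT") for _ in range(1000))
    short_tx = long_tx[:200]
    print(len(simulate_paired_end(long_tx, 100, 300, 10)))
    print(len(simulate_paired_end(short_tx, 100, 300, 10)))
```

In this toy model, a transcript shorter than the chosen fragment size yields no read pairs. That is one simple way in which fragment size could affect short and long transcripts differently, although the abstract itself does not state the mechanism.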

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02b0/11622296/1cf9f6b15bfd/10.1177_11779322241274957-fig1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02b0/11622296/a264084c7df8/10.1177_11779322241274957-fig2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02b0/11622296/fb11560eb609/10.1177_11779322241274957-fig3.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02b0/11622296/b2a157b8736d/10.1177_11779322241274957-fig4.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02b0/11622296/f567a71290c0/10.1177_11779322241274957-fig5.jpg
