Effects of short read quality and quantity on a de novo vertebrate transcriptome assembly.

Author information

Department of Chemistry and Biochemistry, 419 Centennial Hall, Texas State University, 601 University Drive, San Marcos, TX 78666, USA.

Publication information

Comp Biochem Physiol C Toxicol Pharmacol. 2012 Jan;155(1):95-101. doi: 10.1016/j.cbpc.2011.05.012. Epub 2011 Jun 1.

DOI:10.1016/j.cbpc.2011.05.012

PMID:21651990

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3223268/

Abstract

For many researchers, next-generation sequencing data hold the key to answering a category of questions that were previously unassailable. One of the important and challenging steps in achieving these goals is accurately assembling the massive quantity of short sequencing reads into full nucleic acid sequences. For research groups working with non-model or wild systems, short read assembly can pose a significant challenge due to the lack of pre-existing EST or genome reference libraries. While many publications describe the overall process of sequencing and assembly, few address how many and what types of reads are best for assembly. The goal of this project was to use real-world data to explore the effects of read quantity and short read quality scores on the resulting de novo assemblies. Using several samples of short reads of various sizes and qualities, we produced many assemblies in an automated manner. We observe how read length, read quality, and read quantity affect the resulting assemblies and provide some general recommendations based on our real-world data set.
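The abstract does not say how the read samples were generated. As a rough illustration only, the sketch below shows one way to build read sets that vary in both quality and quantity before passing them to an assembler. Everything here is an assumption rather than the authors' pipeline: the helper names (`parse_fastq`, `mean_phred`, `make_read_sets`), the use of mean Phred score as the quality filter, and the Phred+33 offset (older Illumina data often used Phred+64).

```python
import random
from statistics import mean


def parse_fastq(lines):
    """Yield (header, sequence, quality) tuples from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines between or after records
        seq = next(it)
        next(it)  # '+' separator line, ignored
        qual = next(it)
        yield header.strip(), seq.strip(), qual.strip()


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (offset=33 is an assumption)."""
    return mean(ord(c) - offset for c in qual)


def make_read_sets(reads, min_mean_quals, sample_sizes, seed=0):
    """Build one read set per (quality threshold, sample size) combination.

    Reads are first filtered by mean quality, then randomly subsampled
    without replacement. Combinations with too few passing reads are skipped.
    """
    rng = random.Random(seed)
    sets = {}
    for q in min_mean_quals:
        passing = [r for r in reads if mean_phred(r[2]) >= q]
        for n in sample_sizes:
            if n <= len(passing):
                sets[(q, n)] = rng.sample(passing, n)
    return sets


if __name__ == "__main__":
    # Toy input: two high-quality reads ('I' = Q40) and one low-quality ('#' = Q2).
    fastq = [
        "@r1", "ACGT", "+", "IIII",
        "@r2", "ACGT", "+", "####",
        "@r3", "ACGT", "+", "IIII",
    ]
    reads = list(parse_fastq(fastq))
    for key, subset in sorted(make_read_sets(reads, [0, 30], [1, 2, 3]).items()):
        print(key, [r[0] for r in subset])
```

Each resulting read set could then be written back to FASTQ and assembled independently, so that assembly statistics can be compared across quality thresholds and read counts.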

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/283f/3223268/3c2073f7e3d8/nihms320384f1.jpg

Similar articles

Comparative performance of transcriptome assembly methods for non-model organisms.

BMC Genomics. 2016 Jul 27;17:523. doi: 10.1186/s12864-016-2923-8.

Evaluating characteristics of de novo assembly software on 454 transcriptome data: a simulation approach.

PLoS One. 2012;7(2):e31410. doi: 10.1371/journal.pone.0031410. Epub 2012 Feb 27.

Short read Illumina data for the de novo assembly of a non-model snail species transcriptome (Radix balthica, Basommatophora, Pulmonata), and a comparison of assembler performance.

BMC Genomics. 2011 Jun 16;12:317. doi: 10.1186/1471-2164-12-317.

TransRate: reference-free quality assessment of de novo transcriptome assemblies.

Genome Res. 2016 Aug;26(8):1134-44. doi: 10.1101/gr.196469.115. Epub 2016 Jun 1.

Challenges and advances for transcriptome assembly in non-model species.

PLoS One. 2017 Sep 20;12(9):e0185020. doi: 10.1371/journal.pone.0185020. eCollection 2017.

Software for pre-processing Illumina next-generation sequencing short read sequences.

Source Code Biol Med. 2014 May 3;9:8. doi: 10.1186/1751-0473-9-8. eCollection 2014.

Positional bias in variant calls against draft reference assemblies.

BMC Genomics. 2017 Mar 28;18(1):263. doi: 10.1186/s12864-017-3637-2.

Highly accurate long reads are crucial for realizing the potential of biodiversity genomics.

BMC Genomics. 2023 Mar 16;24(1):117. doi: 10.1186/s12864-023-09193-9.

Identifying wrong assemblies in de novo short read primary sequence assembly contigs.

J Biosci. 2016 Sep;41(3):455-74. doi: 10.1007/s12038-016-9630-0.

Cited by

HiFiAdapterFilt, a memory efficient read processing pipeline, prevents occurrence of adapter sequence in PacBio HiFi reads and their negative impacts on genome assembly.

BMC Genomics. 2022 Feb 22;23(1):157. doi: 10.1186/s12864-022-08375-1.

A simple guide to de novo transcriptome assembly and annotation.

Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbab563.

Global assessment of organ specific basal gene expression over a diurnal cycle with analyses of gene copies exhibiting cyclic expression patterns.

BMC Genomics. 2020 Nov 11;21(1):787. doi: 10.1186/s12864-020-07202-9.

Oncogenic allelic interaction in Xiphophorus highlights hybrid incompatibility.

Proc Natl Acad Sci U S A. 2020 Nov 24;117(47):29786-29794. doi: 10.1073/pnas.2010133117. Epub 2020 Nov 9.

Intra-Strain Genetic Variation of Platyfish (Xiphophorus maculatus) Strains Determines Tumorigenic Trajectory.

Front Genet. 2020 Oct 6;11:562594. doi: 10.3389/fgene.2020.562594. eCollection 2020.

Application of the Transcriptional Disease Signature (TDSs) to Screen Melanoma-Effective Compounds in a Small Fish Model.

Sci Rep. 2019 Jan 24;9(1):530. doi: 10.1038/s41598-018-36656-x.

Comparison of Xiphophorus and human melanoma transcriptomes reveals conserved pathway interactions.

Pigment Cell Melanoma Res. 2018 Jul;31(4):496-508. doi: 10.1111/pcmr.12686. Epub 2018 Jan 29.

Expression signatures of early-stage and advanced medaka melanomas.

Comp Biochem Physiol C Toxicol Pharmacol. 2018 Jun;208:20-28. doi: 10.1016/j.cbpc.2017.11.005. Epub 2017 Nov 21.

The transcriptional response of skin to fluorescent light exposure in viviparous (Xiphophorus) and oviparous (Danio, Oryzias) fishes.

Comp Biochem Physiol C Toxicol Pharmacol. 2018 Jun;208:77-86. doi: 10.1016/j.cbpc.2017.10.003. Epub 2017 Oct 7.

Fluorescent light exposure incites acute and prolonged immune responses in zebrafish (Danio rerio) skin.

Comp Biochem Physiol C Toxicol Pharmacol. 2018 Jun;208:87-95. doi: 10.1016/j.cbpc.2017.09.009. Epub 2017 Sep 29.

References

Comparing de novo genome assembly: the long and short of it.

PLoS One. 2011 Apr 29;6(4):e19175. doi: 10.1371/journal.pone.0019175.

Assessing the benefits of using mate-pairs to resolve repeats in de novo short-read prokaryotic assemblies.

BMC Bioinformatics. 2011 Apr 13;12:95. doi: 10.1186/1471-2105-12-95.

De novo assembly of chickpea transcriptome using short reads for gene discovery and marker identification.

DNA Res. 2011 Feb;18(1):53-63. doi: 10.1093/dnares/dsq028. Epub 2011 Jan 7.

What can next generation sequencing do for you? Next generation sequencing as a valuable tool in plant research.

Plant Biol (Stuttg). 2010 Nov;12(6):831-41. doi: 10.1111/j.1438-8677.2010.00373.x.

EDAR: an efficient error detection and removal algorithm for next generation sequencing data.

J Comput Biol. 2010 Nov;17(11):1549-60. doi: 10.1089/cmb.2010.0127. Epub 2010 Oct 25.

Using the Velvet de novo assembler for short-read sequencing technologies.

Curr Protoc Bioinformatics. 2010 Sep;Chapter 11:Unit 11.5. doi: 10.1002/0471250953.bi1105s31.

De novo assembly of short sequence reads.

Brief Bioinform. 2010 Sep;11(5):457-72. doi: 10.1093/bib/bbq020. Epub 2010 Aug 19.

Optimization of de novo transcriptome assembly from next-generation sequencing data.

Genome Res. 2010 Oct;20(10):1432-40. doi: 10.1101/gr.103846.109. Epub 2010 Aug 6.

Uncovering the complexity of transcriptomes with RNA-Seq.

J Biomed Biotechnol. 2010;2010:853916. doi: 10.1155/2010/853916. Epub 2010 Jun 27.

Next-generation sequencing techniques for eukaryotic microorganisms: sequencing-based solutions to biological problems.

Eukaryot Cell. 2010 Sep;9(9):1300-10. doi: 10.1128/EC.00123-10. Epub 2010 Jul 2.
