Comparison of Short-Read Sequence Aligners Indicates Strengths and Weaknesses for Biologists to Consider.

Author information

Musich Ryan, Cadle-Davidson Lance, Osier Michael V

Affiliations

Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology, Rochester, NY, United States.

USDA-Agricultural Research Service, Grape Genetics Research Unit, Geneva, NY, United States.

Publication information

Front Plant Sci. 2021 Apr 16;12:657240. doi: 10.3389/fpls.2021.657240. eCollection 2021.

Abstract

Aligning short-read sequences is the foundational step in most genomic and transcriptomic analyses, but not all tools perform equally, and choosing among the growing body of available tools can be daunting. Here, to increase awareness in the research community, we discuss the merits of common algorithms and programs in a way that should be approachable to biologists with limited experience in bioinformatics. We consider only in passing the effects of data cleanup, a precursor analysis to most alignment tools, and give no consideration to downstream processing of the aligned fragments. To compare aligners [Bowtie2, Burrows-Wheeler Aligner (BWA), HISAT2, MUMmer4, STAR, and TopHat2], we used an RNA-seq dataset containing data from 48 geographically distinct samples of the grapevine powdery mildew fungus. Based on alignment rate and gene coverage, all aligners performed well with the exception of TopHat2, which HISAT2 superseded. BWA perhaps had the best performance in these metrics, except for longer transcripts (>500 bp), for which HISAT2 and STAR performed well. HISAT2 was ~3-fold faster than the next fastest aligner in runtime, which we consider a secondary factor in most alignments. Ultimately, this direct comparison of commonly used aligners illustrates key considerations when choosing which tool to use for the specific sequencing data and objectives. No single tool meets all needs for every user, and there are many quality aligners available.
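
The abstract ranks the aligners by alignment rate and gene coverage but does not define either metric. The sketch below illustrates one common reading of the two terms and is not the authors' pipeline. It assumes that alignment rate is the fraction of input reads that were mapped, and that gene coverage is the fraction of a gene's bases overlapped by at least one aligned read. The function names, the half-open coordinate convention, and the example numbers are all hypothetical.

```python
from typing import Iterable, Tuple


def alignment_rate(mapped_reads: int, total_reads: int) -> float:
    """Fraction of input reads that the aligner mapped (assumed definition)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads / total_reads


def gene_coverage(
    gene_start: int,
    gene_end: int,
    alignments: Iterable[Tuple[int, int]],
) -> float:
    """Fraction of gene bases covered by at least one alignment.

    Coordinates are half-open intervals [start, end) on one reference
    sequence (an assumption made for this sketch).
    """
    gene_length = gene_end - gene_start
    if gene_length <= 0:
        raise ValueError("gene_end must be greater than gene_start")

    # Keep only alignments that overlap the gene, clipped to its bounds.
    clipped = sorted(
        (max(start, gene_start), min(end, gene_end))
        for start, end in alignments
        if end > gene_start and start < gene_end
    )

    covered = 0
    covered_up_to = gene_start  # right edge of the region counted so far
    for start, end in clipped:
        if end <= covered_up_to:
            continue  # entirely inside a region already counted
        covered += end - max(start, covered_up_to)
        covered_up_to = end
    return covered / gene_length


# Hypothetical numbers, for illustration only.
rate = alignment_rate(mapped_reads=90, total_reads=100)
coverage = gene_coverage(100, 200, [(90, 150), (140, 160), (180, 250)])
print(f"alignment rate: {rate:.2f}, gene coverage: {coverage:.2f}")
```

Under these definitions the two metrics can diverge: an aligner may map most reads yet leave parts of a long transcript uncovered, which is one plausible reason results differ for transcripts longer than 500 bp.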

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae37/8087178/d48dd311d9d0/fpls-12-657240-g001.jpg
