HTSQualC is a flexible and one-step quality control software for high-throughput sequencing data analysis.

Affiliations

Texas A&M AgriLife Research and Extension Center, Texas A&M University, Weslaco, TX, USA.

Department of Horticultural Science, Texas A&M University, College Station, TX, USA.

Publication information

Sci Rep. 2021 Sep 21;11(1):18725. doi: 10.1038/s41598-021-98124-3.

Abstract

High-throughput sequencing (HTS) has become indispensable in life science research. Raw HTS data contain several sequencing artifacts, and removing them is an imperative first step for reliable downstream bioinformatics analysis. Although multiple stand-alone tools can perform the various quality control (QC) steps separately, an integrated tool that allows one-step, automated QC analysis of HTS datasets would significantly improve the handling of large numbers of samples in parallel. Here, we developed HTSQualC, a stand-alone, flexible, and easy-to-use software tool for one-step QC analysis of raw HTS data. HTSQualC can evaluate HTS data quality and perform filtering and trimming in a single run. We evaluated the performance of HTSQualC in batch analysis of HTS datasets comprising 322 samples, with an average of ~1 M paired-end sequence reads per sample. HTSQualC completed the QC analysis in ~3 h in distributed mode and ~31 h in shared mode, underscoring its utility and robust performance. In addition to command-line execution, we integrated HTSQualC into the free, open-source CyVerse cyberinfrastructure as a GUI application, giving wider access to experimental biologists who have limited computational resources and/or programming abilities.
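
To make "filtering and trimming in a single run" concrete, the following is a minimal Python sketch of the general idea: trim low-quality bases from the 3' end of each read, then discard reads that are too short, too rich in ambiguous bases (N), or too low in mean quality. It is not HTSQualC's implementation or interface. The function names, thresholds, and the Phred+33 quality encoding are all assumptions chosen for illustration.

```python
from statistics import mean

PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding


def phred_scores(quality_line):
    """Convert a FASTQ quality string to a list of integer Phred scores."""
    return [ord(ch) - PHRED_OFFSET for ch in quality_line]


def trim_3prime(seq, qual, min_q=20):
    """Drop trailing bases whose Phred score is below min_q (hypothetical cutoff)."""
    scores = phred_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


def passes_filter(seq, qual, min_mean_q=20, min_len=4, max_n_frac=0.1):
    """Keep a read only if it is long enough, has few Ns, and good mean quality.

    All three thresholds are illustrative defaults, not values from the paper.
    """
    if len(seq) < min_len:
        return False
    if seq.upper().count("N") / len(seq) > max_n_frac:
        return False
    return mean(phred_scores(qual)) >= min_mean_q


def qc_reads(records):
    """Trim, then filter, an iterable of (name, sequence, quality) tuples."""
    kept = []
    for name, seq, qual in records:
        seq, qual = trim_3prime(seq, qual)
        if passes_filter(seq, qual):
            kept.append((name, seq, qual))
    return kept


# Toy input: 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
reads = [
    ("read1", "ACGTACGT", "IIIIII##"),  # two low-quality bases at the 3' end
    ("read2", "ACGTNNNN", "IIIIIIII"),  # half of the bases are N
    ("read3", "ACGTACGT", "########"),  # low quality throughout
]
result = qc_reads(reads)
print(result)
```

In this toy run only read1 survives, shortened to its six high-quality bases. A real pipeline would also stream records from FASTQ files, handle paired-end mates together, and remove adapters, none of which this sketch attempts.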

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db9a/8455540/e68e26b869e4/41598_2021_98124_Fig1_HTML.jpg
