Department of Genetic Medicine and Development, University of Geneva, Geneva, Switzerland.
Swiss Institute of Bioinformatics, Geneva, Switzerland.
Mol Biol Evol. 2021 Sep 27;38(10):4647-4654. doi: 10.1093/molbev/msab199.
Methods for evaluating the quality of genomic and metagenomic data are essential to aid genome assembly procedures and to correctly interpret the results of subsequent analyses. BUSCO estimates the completeness and redundancy of processed genomic data based on universal single-copy orthologs. Here, we present new functionalities and major improvements of the BUSCO software, as well as the renewal and expansion of the underlying data sets in sync with the OrthoDB v10 release. Among the major novelties, BUSCO now enables phylogenetic placement of the input sequence to automatically select the most appropriate BUSCO data set for the assessment, allowing the analysis of metagenome-assembled genomes of unknown origin. A newly introduced genome workflow improves efficiency and shortens runtimes, especially on large eukaryotic genomes. BUSCO is the only tool capable of assessing both eukaryotic and prokaryotic species, and can be applied to various data types, from genome assemblies and metagenomic bins to transcriptomes and gene sets.
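The completeness and redundancy estimate described above can be illustrated with a short sketch. It assumes that each expected single-copy ortholog has already been classified as complete and single-copy, complete and duplicated, fragmented, or missing. The class `BuscoCounts`, the function `summarize`, and the counts in the usage example are hypothetical names and values chosen for illustration; this is not the BUSCO implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuscoCounts:
    """Hypothetical container for per-category ortholog counts."""
    single_copy: int   # complete, found exactly once
    duplicated: int    # complete, found more than once
    fragmented: int    # only partially recovered
    missing: int       # not found

    @property
    def total(self) -> int:
        # Number of orthologs expected in the chosen data set.
        return (self.single_copy + self.duplicated
                + self.fragmented + self.missing)


def summarize(counts: BuscoCounts) -> dict:
    """Return each category as a percentage of all expected orthologs."""
    n = counts.total
    if n == 0:
        raise ValueError("at least one ortholog is required")

    def pct(x: int) -> float:
        return round(100.0 * x / n, 1)

    return {
        # Completeness: orthologs recovered in full, whatever the copy number.
        "complete": pct(counts.single_copy + counts.duplicated),
        "single_copy": pct(counts.single_copy),
        # Redundancy: complete orthologs present in more than one copy.
        "duplicated": pct(counts.duplicated),
        "fragmented": pct(counts.fragmented),
        "missing": pct(counts.missing),
        "n": n,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(summarize(BuscoCounts(single_copy=90, duplicated=5,
                                fragmented=3, missing=2)))
```

In this sketch a high duplicated percentage signals redundancy in the input, while high fragmented or missing percentages point to incomplete data.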