Obinu Lia, Booth Timothy, De Weerd Heleen, Trivedi Urmi, Porceddu Andrea
Department of Agricultural Sciences, University of Sassari, Viale Italia 39/a, Sassari, Sardinia, 07100, Italy.
Edinburgh Genomics, The University of Edinburgh, Ashworth Laboratories, The King's Buildings, Charlotte Auerbach Rd, Edinburgh, Scotland, EH9 3FL, United Kingdom.
Bioinformatics. 2025 May 6;41(5). doi: 10.1093/bioinformatics/btaf175.
De novo assembly creates reference genomes that underpin many modern biodiversity and conservation studies. Large numbers of new genomes are being assembled by labs around the world. To avoid duplication of effort and variable data quality, a best-practice assembly process is needed, implemented as an automated, portable workflow.
Here, we present Colora, a Snakemake workflow that produces chromosome-scale de novo primary or phased genome assemblies, complete with organelles, using Pacific Biosciences HiFi, Hi-C, and optionally Oxford Nanopore Technologies reads as input. Colora is a user-friendly, versatile, and reproducible pipeline that is ready for use by researchers looking for an automated way to obtain high-quality de novo genome assemblies.
The source code of Colora is available on GitHub (https://github.com/LiaOb21/colora) and has been deposited in Zenodo under DOI https://doi.org/10.5281/zenodo.13321576. Colora is also available at the Snakemake Workflow Catalog (https://snakemake.github.io/snakemake-workflow-catalog/?usage=LiaOb21%2Fcolora).
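As a rough illustration of how a workflow distributed this way is typically obtained and checked, the sketch below builds the two commands a user might run: cloning the GitHub repository named above, then asking Snakemake for a dry run. It only uses generic `git` and Snakemake options (`--cores`, `--dry-run`). The function name `build_commands`, the checkout directory, and the core count are hypothetical, and the sketch assumes the repository follows the standard Snakemake layout and that its configuration has already been filled in. Colora's own documentation remains the authority on the actual invocation.

```python
import shlex

# Repository URL taken from the availability statement.
REPO_URL = "https://github.com/LiaOb21/colora"


def build_commands(workdir="colora", cores=4, dry_run=True):
    """Return (argv, cwd) pairs for fetching the workflow and launching Snakemake.

    Hypothetical helper for illustration only; it builds the commands
    but does not execute anything.
    """
    # Clone the repository into `workdir` (run from the current directory).
    clone = (["git", "clone", REPO_URL, workdir], None)

    # Launch Snakemake from the repository root.
    # Assumption: Snakemake can locate the Snakefile from there.
    argv = ["snakemake", "--cores", str(cores)]
    if dry_run:
        # A dry run lists the planned jobs without executing them.
        argv.append("--dry-run")
    run = (argv, workdir)

    return [clone, run]


if __name__ == "__main__":
    for argv, cwd in build_commands():
        location = cwd if cwd is not None else "."
        print(f"(in {location}) $ {shlex.join(argv)}")
```

Starting with a dry run is a common way to confirm that the configuration and input paths resolve before committing compute time to a full assembly.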