Evolution & Ecology Research Centre, School of Biological, Earth and Environmental Sciences, UNSW Sydney, Sydney, New South Wales, Australia.
Evolution & Ecology Research Centre, School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, New South Wales, Australia.
Mol Ecol Resour. 2022 Nov;22(8):3141-3160. doi: 10.1111/1755-0998.13679. Epub 2022 Jul 18.
The European starling, Sturnus vulgaris, is an ecologically significant, globally invasive avian species that is also suffering a major decline in its native range. Here, we present the genome assembly and long-read transcriptome of an Australian-sourced European starling (S. vulgaris vAU), and a second, North American, short-read genome assembly (S. vulgaris vNA), as complementary reference genomes for population genetic and evolutionary characterization. S. vulgaris vAU combined 10x Genomics linked reads, low-coverage Nanopore sequencing, and PacBio Iso-Seq full-length transcript scaffolding to generate a 1050 Mb assembly on 6222 scaffolds (7.6 Mb scaffold N50, 94.6% BUSCO completeness). Further scaffolding against the high-quality zebra finch (Taeniopygia guttata) genome assigned 98.6% of the assembly to 32 putative nuclear chromosome scaffolds. Species-specific transcript mapping and gene annotation revealed good gene-level assembly and high functional completeness. Using S. vulgaris vAU, we demonstrate how the multifunctional use of PacBio Iso-Seq transcript data and complementary homology-based annotation of sequential assembly steps (assessed using a new tool, SAAGA) can be used to assess, inform, and validate assembly workflow decisions. We also highlight some counterintuitive behaviour in traditional BUSCO metrics, and present BUSCOMP, a complementary tool for assembly comparison designed to be robust to differences in assembly size and base-calling quality. This work expands our knowledge of avian genomes and the available toolkit for assessing and improving genome quality. The new genomic resources presented will facilitate further global genomic and transcriptomic analysis of this ecologically important species.
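For readers unfamiliar with the scaffold N50 statistic quoted in the abstract, the following is a minimal sketch of how it is conventionally computed: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name `scaffold_n50` and the toy lengths are illustrative assumptions and are not taken from the paper or its tools.

```python
from typing import Iterable


def scaffold_n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold lengths.

    Conventional definition (assumed here): sort scaffolds from longest
    to shortest and accumulate their lengths; N50 is the length of the
    scaffold at which the running sum first reaches half of the total.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length
    # Unreachable for non-empty input: the full sum always reaches half.
    raise RuntimeError("N50 could not be determined")


# Hypothetical toy assembly (lengths in arbitrary units), not real data.
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = scaffold_n50(toy_lengths)
```

In the toy example the total length is 20, and the two longest scaffolds (6 and 5) already sum to 11, so the N50 is 5. N50 summarizes contiguity only and says nothing about correctness or completeness, which is why the abstract reports it alongside BUSCO completeness.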