A High-Quality Blue Whale Genome, Segmental Duplications, and Historical Demography.
Author information
Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA.
Southwest Fisheries Science Center, National Oceanic and Atmospheric Administration (NOAA), La Jolla, CA 92037, USA.
Publication information
Mol Biol Evol. 2024 Mar 1;41(3). doi: 10.1093/molbev/msae036.
The blue whale, Balaenoptera musculus, is the largest animal known to have ever existed, making it an important case study in longevity and resistance to cancer. To further this and other blue whale-related research, we report a reference-quality, long-read-based genome assembly of this fascinating species. We assembled the genome from PacBio long reads and utilized Illumina/10×, optical maps, and Hi-C data for scaffolding, polishing, and manual curation. We also provided long-read RNA-seq data to facilitate the annotation of the assembly by NCBI and Ensembl. Additionally, we annotated both haplotypes using TOGA and measured the genome size by flow cytometry. We then compared the blue whale genome with other cetaceans and artiodactyls, including vaquita (Phocoena sinus), the world's smallest cetacean, to investigate the blue whale's unique biological traits. We found a dramatic amplification of several genes in the blue whale genome resulting from a recent burst in segmental duplications, though the possible connection between this amplification and giant body size requires further study. We also discovered sites in the insulin-like growth factor-1 gene correlated with body size in cetaceans. Finally, using our assembly to examine the heterozygosity and historical demography of Pacific and Atlantic blue whale populations, we found that the genomes of both populations are highly heterozygous and that their genetic isolation dates to the last interglacial period. Taken together, these results indicate how a high-quality, annotated blue whale genome will serve as an important resource for biology, evolution, and conservation research.