Lilue Jingtao, Corvelo André, Gómez-Garrido Jèssica, Akagi Keiko, Yang Fengtang, Green Gia, Ng Bee Ling, Fu Beiyuan, Chorostecki Uciel Pablo, Warner Sarah C, Marcet-Houben Marina, Keane Thomas M, Mullikin James C, Alioto Tyler, Gabaldón Toni, Hubert Benjamin, Symer David E, Niewiesk Stefan
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
The Instituto Gulbenkian de Ciência, Oeiras, Portugal.
BMC Biol. 2025 Jul 18;23(1):217. doi: 10.1186/s12915-025-02316-6.
The cotton rat (Sigmodon hispidus), a rodent species native to the Americas, has emerged as a valuable laboratory model of infections by numerous human pathogens including poliovirus and respiratory syncytial virus (RSV).
Here we report the first reference assembly of the cotton rat genome organized at a chromosomal level, providing annotation of 24,878 protein-coding genes. Data from PCR-free whole genome sequencing, linked-read sequencing, and RNA sequencing of pooled cotton rat tissues were analyzed to assemble and annotate this novel genome sequence. Spectral karyotyping with fluorescent probes derived from mouse chromosomes facilitated the assignment of cotton rat orthologs to syntenic chromosomes, comprising 25 autosomes and one sex chromosome in the haploid genome. Comparative phylome analysis revealed both gains and losses of numerous genes, including genes involved in immune defense against pathogens. We identified thousands of recently retrotransposed L1, SINE B2, and ERV elements, revealing widespread genomic insertions unique to this species.
We anticipate that the annotation and characterization of the first chromosome-level cotton rat genome assembly described here will enable and accelerate ongoing investigations into the cotton rat's host defenses against viral and other pathogens, its genome biology, and mammalian evolution.