Insect Biochem Mol Biol. 2008 Dec;38(12):1036-45. doi: 10.1016/j.ibmb.2008.11.004. Epub 2008 Dec 16.
Bombyx mori, the domesticated silkworm, is a major insect model for research and was the first lepidopteran for which draft genome sequences became available, in 2004. Two independent data sets from whole-genome shotgun sequencing were merged and assembled together with newly obtained fosmid- and BAC-end sequences. The resulting, substantially improved assembly is presented here. The 8.5-fold sequence coverage of an estimated 432 Mb genome was assembled into scaffolds with an N50 size of approximately 3.7 Mb; the largest scaffold was 14.5 Mb. With the help of a high-density SNP linkage map, we anchored 87% of the scaffold sequences to all 28 chromosomes. A notable feature was the high repetitive sequence content, estimated at 43.6% and consisting mainly of transposable elements. We predicted 14,623 gene models using a GLEAN-based algorithm, which gave more accurate predictions than the previous gene models for this species. More than three thousand silkworm genes have no homologs in other insect or vertebrate genomes. Some insights into gene evolution and into characteristic biological processes are presented here and in other papers in this issue. The massive silk production correlates with the existence of specific tRNA clusters and of several sericin genes assembled in a cluster. The silkworm's adaptation to feeding on mulberry leaves, which contain toxic alkaloids, is likely linked to the presence of new-type sucrase genes, apparently acquired from bacteria. The silkworm genome also revealed the cascade of genes involved in the juvenile hormone biosynthesis pathway and a large number of cuticular protein genes.
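The abstract summarizes assembly contiguity with the scaffold N50 (approximately 3.7 Mb). As a minimal sketch of how this statistic is conventionally computed, the function below sorts scaffold lengths from longest to shortest and returns the length at which the running total first reaches half of the assembled bases. The function name `n50` and the toy lengths are illustrative assumptions; they are not the silkworm scaffold data or the authors' pipeline.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    if not ordered:
        raise ValueError("at least one scaffold length is required")

    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: running total always reaches half")


if __name__ == "__main__":
    # Hypothetical toy lengths (arbitrary units), not real assembly data.
    toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(toy_scaffolds))
```

With the toy input, the two longest scaffolds (8 and 8) already cover 16 of the 32 total units, so the N50 is 8. A larger N50 indicates that more of the assembly sits in long scaffolds.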