Vergara Ismael A, Mah Allan K, Huang Jim C, Tarailo-Graovac Maja, Johnsen Robert C, Baillie David L, Chen Nansheng
Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada.
BMC Genomics. 2009 Jul 21;10:329. doi: 10.1186/1471-2164-10-329.
The nematode Caenorhabditis elegans was the first multicellular organism to have its genome fully sequenced. In the 10 years since the original publication in 1998, the C. elegans genome has been scrutinized, and the last gaps were filled in November 2002, which presents a unique opportunity for examining genome-wide segmental duplications.
Here, we analyzed the C. elegans genome in search of segmental duplications using OrthoCluster, a new tool we have recently developed. We detected 3,484 duplicated segments (duplicons) ranging in size from 234 bp to 108 kb. The largest pair of duplicons, 108 kb in length and located on the left arm of Chromosome V, was further characterized. The two duplicons are nearly identical at the DNA level (99.7% identity), and each contains 26 putative protein-coding genes. Genotyping of 76 wild-type strains obtained from different labs in the C. elegans community revealed that not all strains contain this duplication. In fact, only 29 strains carry this large segmental duplication, suggesting a very recent duplication event in the C. elegans genome.
This report represents the first demonstration that C. elegans laboratory wild-type N2 strains have acquired large-scale differences.