Cole Charlotte G, McCann Owen T, Collins John E, Oliver Karen, Willey David, Gribble Susan M, Yang Fengtang, McLaren Karen, Rogers Jane, Ning Zemin, Beare David M, Dunham Ian
The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
Genome Biol. 2008;9(5):R78. doi: 10.1186/gb-2008-9-5-r78. Epub 2008 May 13.
Although the human genome sequence was declared complete in 2004, it was still interrupted by 341 gaps, of which 308 lay in an estimated 28 Mb of euchromatin. While these gaps constitute only approximately 1% of the sequence, knowledge of the full complement of human genes and regulatory elements is incomplete without their sequences.
We have used a combination of conventional chromosome walking in fosmid and bacterial artificial chromosome (BAC) libraries (aided by the availability of end sequences), whole chromosome shotgun sequencing, comparative genome analysis and long PCR to close 8 of the 11 gaps in the initial chromosome 22 sequence. In addition, we have used a further 126 kb of new sequence to patch four regions of the initial sequence where the original clones were found to be deleted or to contain a deletion allele of a known gene. Over 1.018 Mb of new sequence has been generated to extend into and close the gaps, and we have annotated 16 new or extended gene structures and one pseudogene.
Thus, we have made significant progress toward completing the sequence of the euchromatic regions of human chromosome 22 using a combination of detailed approaches. Our experience suggests that substantial work remains to close the outstanding gaps in the human genome sequence.