Endrizzi M, Huang S, Scharf J M, Kelter A R, Wirth B, Kunkel L M, Miller W, Dietrich W F
Department of Genetics, Harvard Medical School, 200 Longwood Avenue, Boston, Massachusetts 02115, USA.
Genomics. 1999 Sep 1;60(2):137-51. doi: 10.1006/geno.1999.5910.
Human chromosome 5q11.2-q13.3 and its ortholog on mouse chromosome 13 contain candidate genes for an inherited human neurodegenerative disorder, spinal muscular atrophy (SMA), and for an inherited mouse susceptibility to infection with Legionella pneumophila (Lgn1). These homologous genomic regions also have unusual repetitive organizations that create practical difficulties in mapping and raise interesting questions about the evolutionary origin of the repeats. To analyze this region in detail and to identify additional candidate genes for these diseases, we determined the sequence of 179 kb of the mouse Lgn1/SMA interval. We analyzed this sequence using BLAST searches and various exon prediction programs to identify potential genes. Because these methods can generate false-positive exon predictions, we aligned the mouse sequence with the available orthologous human sequence; the alignments let us discriminate rapidly among the potential coding regions by showing which were well conserved and therefore more likely to represent actual coding sequence. As a result, we accurately mapped two additional genes in the SMA interval that can be tested for involvement in the pathogenesis of SMA. Although no new Lgn1 candidates emerged, we identified new genetic markers that exclude Smn as an Lgn1 candidate. In addition to providing important resources for studying SMA and Lgn1, our data provide further evidence of the value of sequencing the mouse genome to help annotate the human genomic sequence, and vice versa.
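The conservation filter described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy sequences, the exon coordinates, and the 0.8 identity threshold are all hypothetical, and the sketch assumes a gapped pairwise mouse/human alignment is already available as two equal-length strings.

```python
from typing import List, Tuple


def mouse_to_column(mouse_aln: str) -> List[int]:
    """Map each ungapped mouse position (0-based) to its alignment column."""
    return [col for col, base in enumerate(mouse_aln) if base != "-"]


def exon_identity(mouse_aln: str, human_aln: str, start: int, end: int) -> float:
    """Fraction of mouse bases in [start, end) that match the aligned human base.

    A human gap opposite a mouse base counts as a mismatch.
    """
    cols = mouse_to_column(mouse_aln)[start:end]
    if not cols:
        return 0.0
    matches = sum(
        1 for c in cols if mouse_aln[c].upper() == human_aln[c].upper()
    )
    return matches / len(cols)


def filter_conserved(
    mouse_aln: str,
    human_aln: str,
    exons: List[Tuple[int, int]],
    min_identity: float = 0.8,  # hypothetical threshold, not from the paper
) -> List[Tuple[int, int]]:
    """Keep predicted exons whose mouse/human identity meets the threshold."""
    return [
        (s, e)
        for s, e in exons
        if exon_identity(mouse_aln, human_aln, s, e) >= min_identity
    ]


# Toy alignment (invented data); exon coordinates are 0-based, half-open,
# and refer to the ungapped mouse sequence.
mouse_aln = "ATGGCC--TTAGGACCA"
human_aln = "ATGGCTAATTCGAACCA"
predicted_exons = [(0, 6), (6, 12), (12, 15)]

conserved = filter_conserved(mouse_aln, human_aln, predicted_exons)
print(conserved)
```

In this toy case the middle prediction falls below the threshold and is dropped, which mirrors the idea of using cross-species conservation to discard likely false-positive exon predictions.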