Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA.
PLoS Genet. 2009 Dec;5(12):e1000759. doi: 10.1371/journal.pgen.1000759. Epub 2009 Dec 11.
An accurate and precisely annotated genome assembly is a fundamental requirement for functional genomic analysis. Here, the complete DNA sequence and gene annotation of mouse Chromosome 11 was used to test the efficacy of large-scale sequencing for mutation identification. We re-sequenced the 14,000 annotated exons and boundaries from over 900 genes in 41 recessive mutant mouse lines that were isolated in an N-ethyl-N-nitrosourea (ENU) mutation screen targeted to mouse Chromosome 11. Fifty-nine sequence variants were identified in 55 genes from 31 mutant lines. 39% of the lesions lie in coding sequences and create primarily missense mutations. The other 61% lie in noncoding regions, many of them in highly conserved sequences. A lesion in the perinatal lethal line l11Jus13 alters a consensus splice site of nucleoredoxin (Nxn), inserting 10 amino acids into the resulting protein. We conclude that point mutations can be accurately and sensitively recovered by large-scale sequencing, and that conserved noncoding regions should be included for disease mutation identification. Only seven of the candidate genes we report have been previously targeted by mutation in mice or rats, showing that despite ongoing efforts to functionally annotate genes in the mammalian genome, an enormous gap remains between phenotype and function. Our data show that the classical positional mapping approach of disease mutation identification can be extended to large target regions using high-throughput sequencing.