Deutsch S, Iseli C, Bucher P, Antonarakis S E, Scott H S
Division of Medical Genetics, University of Geneva Medical School, Geneva, Switzerland.
Genome Res. 2001 Feb;11(2):300-7. doi: 10.1101/gr.164901.
Single nucleotide polymorphisms (SNPs) are likely to contribute to the study of complex genetic diseases. The genomic sequence of human chromosome 21q was recently completed with 225 annotated genes, which permits efficient identification and precise mapping of potential coding SNPs (cSNPs) by bioinformatics approaches. Here we present a human chromosome 21 (HC21) cSNP database and the first chromosome-specific cSNP map. Potential cSNPs were generated using three approaches: (1) alignment of the complete HC21 genomic sequence to cognate ESTs and mRNAs, with candidate cSNPs extracted automatically by a novel program for context-dependent SNP identification that efficiently discriminates between true variation, poor-quality sequencing, and paralogous gene alignments; (2) multiple alignment of all known HC21 genes to all other human database entries; and (3) gene-targeted cSNP discovery. To date we have identified 377 cSNPs, averaging ~1 SNP per 1.5 kb of transcribed sequence and covering 65% of known genes on the chromosome. The bioinformatics approach was validated by a confirmation rate of 78% for the predicted cSNPs, and in total 32% of the cSNPs in our database have been confirmed. The database is publicly available at http://csnp.unige.ch or http://csnp.isb-sib.ch. These SNPs provide a tool to study the contribution of HC21 loci to complex diseases such as bipolar affective disorder and allele-specific contributions to Down syndrome phenotypes.