Picoult-Newberg L, Ideker T E, Pohl M G, Taylor S L, Donaldson M A, Nickerson D A, Boyce-Jacino M
Orchid Biocomputer, Inc., Alpha Center, Johns Hopkins Bayview Research Campus, Baltimore, Maryland 21224, USA.
Genome Res. 1999 Feb;9(2):167-74.
There is considerable interest in the discovery and characterization of single nucleotide polymorphisms (SNPs) to enable the analysis of the potential relationships between human genotype and phenotype. Here we present a strategy that permits the rapid discovery of SNPs from publicly available expressed sequence tag (EST) databases. From a set of ESTs derived from 19 different cDNA libraries, we assembled 300,000 distinct sequences and identified 850 mismatches from contiguous EST data sets (candidate SNP sites), without de novo sequencing. Through a polymerase-mediated, single-base, primer extension technique, Genetic Bit Analysis (GBA), we confirmed the presence of a subset of these candidate SNP sites and have estimated the allele frequencies in three human populations with different ethnic origins. Altogether, our approach provides a basis for rapid and efficient regional and genome-wide SNP discovery using data assembled from sequences from different libraries of cDNAs.
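The abstract describes two computational steps: flagging mismatches among overlapping ESTs as candidate SNP sites, and estimating allele frequencies from genotyping results. The Python sketch below illustrates both under stated assumptions and is not the authors' pipeline. It assumes the ESTs in a contig are already aligned into equal-length strings with `-` for gaps, and that a column counts as a candidate only when a second base is seen in at least `min_minor_count` reads. That threshold is an illustrative filter against isolated sequencing errors, since the abstract does not state the actual criteria. The function names `candidate_snp_columns` and `allele_frequency` are hypothetical, and the frequency calculation assumes diploid genotype counts for a biallelic site.

```python
from collections import Counter


def candidate_snp_columns(aligned_reads, min_minor_count=2):
    """Return (column, base_counts) for alignment columns with a recurring mismatch.

    aligned_reads: equal-length strings over A/C/G/T; any other character
    (gap '-', ambiguous 'N') is ignored when counting.
    min_minor_count: assumed threshold; the second most common base must
    appear in at least this many reads for the column to be reported.
    """
    if not aligned_reads:
        return []
    width = len(aligned_reads[0])
    if any(len(read) != width for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    candidates = []
    for col in range(width):
        counts = Counter(
            read[col].upper()
            for read in aligned_reads
            if read[col].upper() in "ACGT"
        )
        ranked = counts.most_common()  # sorted by descending count
        if len(ranked) >= 2 and ranked[1][1] >= min_minor_count:
            candidates.append((col, dict(counts)))
    return candidates


def allele_frequency(n_hom_a, n_het, n_hom_b):
    """Frequency of allele A from diploid genotype counts (AA, AB, BB)."""
    total_alleles = 2 * (n_hom_a + n_het + n_hom_b)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    return (2 * n_hom_a + n_het) / total_alleles


if __name__ == "__main__":
    # Toy contig: column 3 shows T in two reads and A in two reads.
    reads = [
        "ACGTACGT",
        "ACGTACGT",
        "ACGAACGT",
        "ACGAACGT",
        "ACG-ACGT",
    ]
    for col, counts in candidate_snp_columns(reads):
        print(col, counts)
    # Hypothetical genotype counts for one population: 10 AA, 20 AB, 10 BB.
    print(allele_frequency(10, 20, 10))
```

Requiring the minor base in more than one read trades sensitivity for specificity: a mismatch seen in a single EST is more likely to be a sequencing error than one seen independently in several reads, though rare true variants will be missed at low coverage.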