Single nucleotide polymorphisms associated with rat expressed sequences.

Author information

Guryev Victor, Berezikov Eugene, Malik Rainer, Plasterk Ronald H A, Cuppen Edwin

Affiliation

Hubrecht Laboratory, Netherlands Institute for Developmental Biology, Uppsalalaan 8, 3584CT, Utrecht, The Netherlands.

Publication information

Genome Res. 2004 Jul;14(7):1438-43. doi: 10.1101/gr.2154304.

Abstract

Single nucleotide polymorphisms (SNPs) are the most common source of genetic variation in populations and are thus most likely to account for the majority of phenotypic and behavioral differences between individuals or strains. Although the rat is extensively studied for the latter, data on naturally occurring polymorphisms are mostly lacking. We have used publicly available sequences consisting of whole-genome shotgun (WGS), expressed sequence tag (EST), and mRNA data as a source for the in silico identification of SNPs in gene-coding regions and have identified a large collection of 33,305 high-quality candidate SNPs. Experimental verification of 471 candidate SNPs using a limited set of rat isolates revealed a confirmation rate of approximately 50%. Although the majority of SNPs were identified between Sprague-Dawley (EST data) and Brown Norway (WGS data) strains, we found that 66% of the verified variations are common among different rat strains. All SNPs were extensively annotated, including chromosomal and genetic map information, and nonsynonymous SNPs were analyzed by SIFT and PolyPhen prediction programs for their potential deleterious effect on protein function. Interestingly, we retrieved three SNPs from the database that result in the introduction of a premature stop codon and that could be confirmed experimentally. Two of these "in silico-identified knockouts" reside in interesting QTL regions. Data are publicly available via a Web interface (http://cascad.niob.knaw.nl), allowing simple and advanced search queries.
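The distinction the abstract draws between synonymous SNPs, nonsynonymous SNPs, and SNPs that introduce a premature stop codon can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `classify_snp`, the 0-based position convention, and the toy coding sequence are assumptions made for illustration, and only the standard genetic code is used.

```python
# Illustrative sketch only; not the method used in the paper.
# Assumes an in-frame coding sequence (CDS) and a 0-based SNP position.

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_snp(cds: str, pos: int, alt: str) -> str:
    """Classify a single-base substitution in a coding sequence.

    Returns 'synonymous', 'nonsynonymous', or 'nonsense'
    (a substitution that turns a sense codon into a stop codon).
    """
    cds = cds.upper()
    alt = alt.upper()
    if alt not in BASES or not 0 <= pos < len(cds) - len(cds) % 3:
        raise ValueError("position or allele outside the in-frame CDS")

    start = pos - pos % 3                 # first base of the affected codon
    ref_codon = cds[start:start + 3]
    offset = pos - start
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]

    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "nonsynonymous"


if __name__ == "__main__":
    toy_cds = "ATGTGGAAATAA"  # hypothetical CDS: Met-Trp-Lys-stop
    print(classify_snp(toy_cds, 5, "A"))  # TGG -> TGA
    print(classify_snp(toy_cds, 8, "G"))  # AAA -> AAG
    print(classify_snp(toy_cds, 6, "G"))  # AAA -> GAA
```

Only the nonsynonymous class would be passed on to effect predictors such as SIFT or PolyPhen; the nonsense class corresponds to the "in silico-identified knockouts" described above.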
