SNP500Cancer: a public resource for sequence validation and assay development for genetic variation in candidate genes.

Author information

Packer Bernice R, Yeager Meredith, Staats Brian, Welch Robert, Crenshaw Andrew, Kiley Maureen, Eckert Andrew, Beerman Michael, Miller Edward, Bergen Andrew, Rothman Nathaniel, Strausberg Robert, Chanock Stephen J

Affiliation

Intramural Research Support Program, SAIC-Frederick, NCI-FCRDC, Frederick, MD, USA.

Publication information

Nucleic Acids Res. 2004 Jan 1;32(Database issue):D528-32. doi: 10.1093/nar/gkh005.

DOI:10.1093/nar/gkh005

PMID:14681474

Original link: https://pmc.ncbi.nlm.nih.gov/articles/PMC308740/

Abstract

The SNP500Cancer Database provides sequence and genotype assay information for candidate single nucleotide polymorphisms (SNPs) useful in mapping complex diseases, such as cancer. The database is an integral component of the NCI's Cancer Genome Anatomy Project. SNP500Cancer provides bi-directional sequencing information on a set of control DNA samples derived from anonymized subjects (102 Coriell samples representing four self-described ethnic groups: African/African-American, Caucasian, Hispanic and Pacific Rim). All SNPs are chosen from public databases and reports, and the choice of genes includes a bias towards non-synonymous and promoter SNPs in genes that have been implicated in one or more cancers. The web site is searchable by gene, chromosome, gene ontology pathway and by known dbSNP ID. As of July 2003, the database contains over 3400 SNPs, 2490 of which have been sequenced in the SNP500Cancer population. For each analyzed SNP, gene location and over 200 bp of surrounding annotated sequence (including nearby SNPs) are provided, with frequency information in total and per subpopulation, and calculation of Hardy-Weinberg Equilibrium (HWE) for each subpopulation. Sequence validated SNPs with minor allele frequency > 5% are entered into a high-throughput pipeline for genotyping analysis to determine concordance for the same 102 samples. The website provides the conditions for validated genotyping assays. SNP500Cancer provides an invaluable resource for investigators to select SNPs for analysis, design genotyping assays using validated sequence data, choose selected assays already validated on one or more genotyping platforms, and select reference standards for genotyping assays. The SNP500Cancer Database is freely accessible via the web page at http://snp500cancer.nci.nih.gov/.
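
The abstract mentions per-subpopulation allele frequencies, a Hardy-Weinberg Equilibrium (HWE) calculation, and a minor allele frequency threshold of 5% for entry into the genotyping pipeline. The following is a minimal sketch of how such values could be derived from genotype counts for one biallelic SNP. The abstract does not say which HWE test the database uses, so the chi-square goodness-of-fit test with one degree of freedom is an assumption, and the function name `hwe_summary` and its parameters are hypothetical.

```python
import math


def hwe_summary(n_aa, n_ab, n_bb, maf_threshold=0.05):
    """Summarize allele frequencies and HWE fit for one biallelic SNP.

    n_aa, n_ab, n_bb are observed genotype counts (hypothetical inputs).
    The chi-square test with 1 degree of freedom is an assumption; the
    source abstract does not state which HWE test the database applies.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped sample is required")

    # Allele frequencies from genotype counts (two alleles per individual).
    p = (2 * n_aa + n_ab) / (2 * n)
    q = (2 * n_bb + n_ab) / (2 * n)
    maf = min(p, q)

    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)

    # Chi-square statistic; classes with zero expectation are skipped.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))

    return {
        "n": n,
        "freq_a": p,
        "freq_b": q,
        "maf": maf,
        "chi2": chi2,
        "p_value": p_value,
        "passes_maf": maf > maf_threshold,
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not values from the database.
    print(hwe_summary(60, 36, 6))
```

With subpopulations as small as those in a 102-sample panel, expected counts for the rare homozygote are often low, so an exact test may be more appropriate than the chi-square approximation sketched here.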
