Coverage and characteristics of the Affymetrix GeneChip Human Mapping 100K SNP set.

Author information

Nicolae Dan L, Wen Xiaoquan, Voight Benjamin F, Cox Nancy J

Affiliation

Department of Statistics, The University of Chicago, Chicago, Illinois, USA.

Publication information

PLoS Genet. 2006 May;2(5):e67. doi: 10.1371/journal.pgen.0020067. Epub 2006 May 5.

Abstract

Improvements in technology have made it possible to conduct genome-wide association mapping at costs within reach of academic investigators, and experiments are currently being conducted with a variety of high-throughput platforms. To provide an appropriate context for interpreting results of such studies, we summarize here results of an investigation of one of the first of these technologies to be publicly available, the Affymetrix GeneChip Human Mapping 100K set of single nucleotide polymorphisms (SNPs). In a systematic analysis of the pattern and distribution of SNPs in the Mapping 100K set, we find that SNPs in this set are undersampled from coding regions (both nonsynonymous and synonymous) and oversampled from regions outside genes, relative to SNPs in the overall HapMap database. In addition, we utilize a novel multilocus linkage disequilibrium (LD) coefficient based on information content (analogous to the information content scores commonly used for linkage mapping) that is equivalent to the familiar measure r² in the special case of two loci. Using this approach, we are able to summarize for any subset of markers, such as the Affymetrix Mapping 100K set, the information available for association mapping in that subset, relative to the information available in the full set of markers included in the HapMap, and highlight circumstances in which this multilocus measure of LD provides substantial additional insight about the haplotype structure in a region over pairwise measures of LD.
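
The abstract does not define the multilocus coefficient itself, so the following is only a minimal sketch of the two-locus special case it reduces to: the standard pairwise r², computed as D² / (p_A (1 − p_A) p_B (1 − p_B)) with D = p_AB − p_A p_B. The function name `pairwise_r2` and the input layout (phased haplotypes as pairs of 0/1 alleles) are assumptions made for illustration and are not taken from the paper.

```python
def pairwise_r2(haplotypes):
    """Standard pairwise r^2 between two biallelic loci.

    `haplotypes` is a list of (a, b) tuples, one per phased haplotype,
    where a and b are 0/1 alleles at the first and second locus.
    This input format is a hypothetical choice for this sketch.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")

    # Allele frequencies of the '1' allele at each locus.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n

    # Frequency of the haplotype carrying '1' at both loci.
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either locus is monomorphic")

    # D is the deviation of the observed haplotype frequency
    # from its expectation under independence.
    d = p_ab - p_a * p_b
    return d * d / denom


# Two loci that always carry the same allele are in complete LD.
linked = [(0, 0), (1, 1), (0, 0), (1, 1)]
# All four haplotypes equally frequent: the loci are independent.
unlinked = [(0, 0), (0, 1), (1, 0), (1, 1)]

print(pairwise_r2(linked))
print(pairwise_r2(unlinked))
```

In this sketch a value near 1 means one SNP is an almost perfect proxy for the other, which is the sense in which a genotyped subset can carry information about markers that are not on the array.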

