Genotyping errors, pedigree errors, and missing data.

Author information

Hinrichs Anthony L, Suarez Brian K

Affiliation

Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri 63110, USA.

Publication information

Genet Epidemiol. 2005;29 Suppl 1:S120-4. doi: 10.1002/gepi.20120.

Abstract

Our group studied the effects of genotyping errors, pedigree errors, and missing data on a wide range of techniques, with a focus on the role of single-nucleotide polymorphisms (SNPs). Half of our group used simulated data, and half of our group used data from the Collaborative Study on the Genetics of Alcoholism (COGA). The simulated data had no missing genotypes and no genotyping errors, so our group, as a whole, removed data and introduced artificial errors to study the robustness of various techniques. Our teams showed that genotyping errors are less detectable and may have a greater impact on SNPs than on microsatellites, but recently developed methods that account for genotyping errors help reduce false positives, and the assumptions of these methods appear to be supported by observations from repeated genotyping. The ability to detect linkage disequilibrium (LD) was also substantially reduced by missing data; this in turn could affect tagging SNPs chosen to generate haplotypes. In the COGA sample, genotyping measurements were repeated in three ways. First, full-genome screens were performed on three sets of markers: 328 microsatellites, 11,560 SNPs from the Affymetrix GeneChip Mapping 10 K Array marker set, and 4,720 SNPs from the Illumina Linkage III panel. Second, the entire Affymetrix marker set was typed on the same 184 individuals by two different laboratories. Finally, the Affymetrix and Illumina marker panels had 94 SNPs in common. Our teams showed that both SNPs and microsatellites can be readily used to identify pedigree errors, and that SNPs have fewer genotyping errors and a low inconsistency rate. However, a fairly high rate of no-calls, especially for the Affymetrix platform, suggests that the inconsistency rate may be higher than observed.
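The closing caveat of the abstract, that no-calls can hide disagreements, can be illustrated with a minimal sketch. It is not the authors' method: the function name `compare_calls`, the use of `None` for a no-call, and the two-letter genotype strings are all assumptions made for illustration. The inconsistency rate is computed only over pairs where both runs produced a call, so every no-call shrinks the set of genotypes that could have been found to disagree.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical encoding: a genotype is a two-letter string such as "AB",
# and None stands for a no-call. Neither convention comes from the paper.
Genotype = Optional[str]


def compare_calls(run_a: Sequence[Genotype],
                  run_b: Sequence[Genotype]) -> Tuple[float, float]:
    """Return (no_call_rate, inconsistency_rate) for paired genotype calls.

    no_call_rate: fraction of pairs where at least one run made no call.
    inconsistency_rate: fraction of discordant pairs among those where
    both runs made a call (allele order is ignored, so "AB" equals "BA").
    """
    if len(run_a) != len(run_b):
        raise ValueError("both runs must cover the same genotypes")

    total = len(run_a)
    compared = 0
    discordant = 0
    for a, b in zip(run_a, run_b):
        if a is None or b is None:
            continue  # a no-call pair cannot be checked for agreement
        compared += 1
        if sorted(a) != sorted(b):
            discordant += 1

    no_call_rate = (total - compared) / total if total else 0.0
    inconsistency_rate = discordant / compared if compared else 0.0
    return no_call_rate, inconsistency_rate


# Toy data, invented for this example only.
lab_1 = ["AA", "AB", None, "BB", "AB"]
lab_2 = ["AA", "BA", "AB", None, "BB"]
no_call_rate, inconsistency_rate = compare_calls(lab_1, lab_2)
```

In this toy input, two of the five pairs are dropped because of a no-call, and the observed inconsistency rate rests on the remaining three. If difficult genotypes are more likely to end up as no-calls, the rate among compared pairs would understate the true rate, which is the concern the abstract raises.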

