A genotype calling algorithm for Affymetrix SNP arrays.

Author information

Rabbee Nusrat, Speed Terence P

Affiliations

Department of Statistics, University of California-Berkeley, Berkeley, CA, USA.

Publication information

Bioinformatics. 2006 Jan 1;22(1):7-12. doi: 10.1093/bioinformatics/bti741. Epub 2005 Nov 2.

Abstract

A classification algorithm based on a multi-chip, multi-SNP approach is proposed for Affymetrix SNP arrays. Current procedures for calling genotypes on SNP arrays process all the features associated with one chip and one SNP at a time. Using a large training sample where the genotype labels are known, we develop a supervised learning algorithm to obtain more accurate classification results on new data. The method we propose, RLMM, is based on a robustly fitted linear model and uses the Mahalanobis distance for classification. The chip-to-chip non-biological variance is reduced through normalization. This model-based algorithm captures the similarities across genotype groups and probes, as well as across thousands of SNPs, for accurate classification. In this paper, we apply RLMM to Affymetrix 100K SNP array data, present classification results, and compare them with genotype calls obtained from the Affymetrix procedure DM, as well as with the publicly available genotype calls from the HapMap project.
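To illustrate the classification step the abstract mentions, the following is a minimal sketch of assigning a genotype by smallest Mahalanobis distance to per-genotype clusters learned from labeled training data. It is not the authors' RLMM implementation: the two-dimensional feature space (one summarized intensity per allele), the function names, and the synthetic training data are all assumptions made for illustration, and plain sample means and covariances stand in for the robustly fitted linear model and normalization described in the paper.

```python
import numpy as np

GENOTYPES = ("AA", "AB", "BB")


def make_toy_training_data(seed=0, n_per_group=50):
    """Hypothetical training set: synthetic (allele A, allele B) signal pairs.

    The cluster centers and noise level are invented for illustration only.
    """
    rng = np.random.default_rng(seed)
    centers = {"AA": (2.0, 0.0), "AB": (1.0, 1.0), "BB": (0.0, 2.0)}
    X = np.vstack(
        [rng.normal(loc=centers[g], scale=0.1, size=(n_per_group, 2)) for g in GENOTYPES]
    )
    labels = np.repeat(GENOTYPES, n_per_group)
    return X, labels


def fit_genotype_clusters(X, labels):
    """Estimate a mean vector and covariance matrix for each genotype group.

    Assumption: ordinary (non-robust) estimates are used here for brevity.
    """
    clusters = {}
    for g in GENOTYPES:
        pts = X[labels == g]
        clusters[g] = (pts.mean(axis=0), np.cov(pts, rowvar=False))
    return clusters


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)^T cov^{-1} (x - mean)."""
    d = np.asarray(x, dtype=float) - mean
    return float(d @ np.linalg.solve(cov, d))


def call_genotype(x, clusters):
    """Return the genotype whose cluster is nearest in Mahalanobis distance."""
    dists = {g: mahalanobis_sq(x, m, c) for g, (m, c) in clusters.items()}
    return min(dists, key=dists.get), dists


X, labels = make_toy_training_data()
clusters = fit_genotype_clusters(X, labels)
call, dists = call_genotype([1.9, 0.1], clusters)
print(call, {g: round(v, 1) for g, v in dists.items()})
```

Because the distance is scaled by each cluster's covariance, a sample is compared against the shape of every genotype group rather than only its center, which is why a per-group covariance estimate is kept alongside each mean in this sketch.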
