ALCHEMY: a reliable method for automated SNP genotype calling for small batch sizes and highly homozygous populations.

Affiliation

Department of Biological Statistics and Computational Biology, 102 Weill Hall, Cornell University, Ithaca, NY 14853, USA.

Publication information

Bioinformatics. 2010 Dec 1;26(23):2952-60. doi: 10.1093/bioinformatics/btq533. Epub 2010 Oct 5.

Abstract

MOTIVATION

The development of new high-throughput genotyping products requires a significant investment in testing and training samples to evaluate and optimize the product before it can be used reliably on new samples. One reason for this is that current methods for automated genotype calling are based on clustering approaches, which require either a large number of samples to be analyzed simultaneously or an extensive training dataset to seed clusters. In systems where inbred samples are of primary interest, current clustering approaches perform poorly because a heterozygote cluster cannot be clearly identified.

RESULTS

As part of the development of two custom single nucleotide polymorphism (SNP) genotyping products for Oryza sativa (domestic rice), we have developed a new genotype calling algorithm called 'ALCHEMY', based on statistical modeling of the raw intensity data rather than modelless clustering. A novel feature of the model is the ability to estimate and incorporate inbreeding information on a per-sample basis, allowing accurate genotyping of both inbred and heterozygous samples even when they are analyzed simultaneously. Because clustering is not used explicitly, ALCHEMY performs well on small sample sizes, with accuracy exceeding 99% with as few as 18 samples.
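
To illustrate why per-sample inbreeding information matters for genotype calling, the sketch below uses the standard population-genetics relationship between an inbreeding coefficient F and expected genotype frequencies at a biallelic SNP. This is not ALCHEMY's actual model, which the abstract does not specify; the function name `genotype_priors`, the allele labels A/B, and the example values of p and F are assumptions chosen for illustration.

```python
def genotype_priors(p: float, f: float) -> dict[str, float]:
    """Expected genotype frequencies at a biallelic SNP.

    p: frequency of allele A (hypothetical input)
    f: inbreeding coefficient of the sample, 0 = outbred, 1 = fully inbred
    """
    if not (0.0 <= p <= 1.0 and 0.0 <= f <= 1.0):
        raise ValueError("p and f must lie in [0, 1]")
    q = 1.0 - p
    return {
        "AA": p * p + f * p * q,        # homozygotes gain probability mass
        "AB": 2.0 * p * q * (1.0 - f),  # heterozygotes shrink by a factor (1 - f)
        "BB": q * q + f * p * q,
    }


# Illustrative values only: an outbred sample versus a highly inbred line.
outbred = genotype_priors(p=0.5, f=0.0)
inbred = genotype_priors(p=0.5, f=0.95)

for label, priors in (("outbred", outbred), ("inbred", inbred)):
    print(label, {g: round(v, 4) for g, v in priors.items()})
```

With a high F, the heterozygote prior becomes very small, so a caller that knows a sample is inbred can treat an apparent heterozygous signal with appropriate skepticism, while a sample with F near zero keeps the usual Hardy-Weinberg expectation.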

AVAILABILITY

ALCHEMY is available free of charge for both commercial and academic use and is distributed under the GNU General Public License at http://alchemy.sourceforge.net/

CONTACT

mhw6@cornell.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
