Bayesian Gaussian Mixture Models for High-Density Genotyping Arrays.

Authors

Chiara Sabatti, Kenneth Lange

Affiliation

Departments of Human Genetics and Statistics, University of California, Los Angeles, CA 90095.

Publication

J Am Stat Assoc. 2008 Mar 1;103(481):89-100. doi: 10.1198/016214507000000338.

DOI: 10.1198/016214507000000338
PMID: 21572926
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3092390/

Abstract

Affymetrix's SNP (single-nucleotide polymorphism) genotyping chips have increased the scope and decreased the cost of gene-mapping studies. Because each SNP is queried by multiple DNA probes, the chips present interesting challenges in genotype calling. Traditional clustering methods distinguish the three genotypes of an SNP fairly well given a large enough sample of unrelated individuals or a training sample of known genotypes. This article describes our attempt to improve genotype calling by constructing Gaussian mixture models with empirically derived priors. The priors stabilize parameter estimation and borrow information collectively gathered on tens of thousands of SNPs. When data from related family members are available, our models capture the correlations in signals between relatives. With these advantages in mind, we apply the models to Affymetrix probe intensity data on 10,000 SNPs gathered on 63 genotyped individuals spread over eight pedigrees. We integrate the genotype-calling model with pedigree analysis and examine a sequence of symmetry hypotheses involving the correlated probe signals. The symmetry hypotheses raise novel mathematical issues of parameterization. Using the Bayesian information criterion, we select the best combination of symmetry assumptions. Compared to Affymetrix's software, our model leads to a reduction in no-calls with little sacrifice in overall calling accuracy.
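
To make the idea of prior-stabilized genotype calling concrete, the following is a minimal sketch and not the authors' model. It assumes each sample is reduced to a single hypothetical contrast signal (roughly +0.8 for AA, 0 for AB, -0.8 for BB), whereas the article works with multiple correlated probe intensities and pedigree information. The names `fit_map_mixture`, `call_genotypes`, `kappa`, `var_floor`, and `threshold` are illustrative assumptions. Each cluster mean receives a normal prior worth `kappa` pseudo-observations, so a genotype cluster with few or no samples is still anchored, and a sample is left uncalled when no posterior probability reaches the threshold.

```python
import math

LABELS = ("AA", "AB", "BB")  # the three genotypes of a biallelic SNP


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posteriors(x, weights, means, variances):
    """Posterior genotype probabilities for one signal, or None on underflow."""
    dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(dens)
    return [d / total for d in dens] if total > 0.0 else None


def fit_map_mixture(xs, prior_means, kappa=5.0, var_floor=1e-3, n_iter=50):
    """EM with MAP updates for a 1-D Gaussian mixture (illustrative only).

    Assumed priors: mean_j ~ N(prior_means[j], var_j / kappa), a flat prior
    on var_j, and a Dirichlet(2, ..., 2) prior on the mixing weights.
    """
    k = len(prior_means)
    weights = [1.0 / k] * k
    means = list(prior_means)
    variances = [0.05] * k  # arbitrary starting value
    for _ in range(n_iter):
        # E-step: responsibility of each genotype cluster for each sample.
        resp = [posteriors(x, weights, means, variances) or [1.0 / k] * k for x in xs]
        # M-step: MAP estimates; the prior acts like kappa extra observations.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            sx = sum(r[j] * x for r, x in zip(resp, xs))
            means[j] = (kappa * prior_means[j] + sx) / (kappa + nj)
            ss = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs))
            shrink = kappa * (means[j] - prior_means[j]) ** 2
            # The floor is a practical safeguard, not part of the MAP solution.
            variances[j] = max((ss + shrink) / (nj + 1.0), var_floor)
            weights[j] = (nj + 1.0) / (len(xs) + k)
    return weights, means, variances


def call_genotypes(xs, params, threshold=0.95):
    """Call the most probable genotype, or "NoCall" below the threshold."""
    calls = []
    for x in xs:
        post = posteriors(x, *params)
        if post is None or max(post) < threshold:
            calls.append("NoCall")
        else:
            calls.append(LABELS[post.index(max(post))])
    return calls


# Made-up signals: four AA-like, four AB-like, and a single BB-like sample.
signals = [0.80, 0.75, 0.82, 0.79, 0.02, -0.03, 0.05, 0.00, -0.80]
params = fit_map_mixture(signals, prior_means=(0.8, 0.0, -0.8))
print(call_genotypes(signals + [0.40], params))
```

In this toy data set the BB cluster contains one sample, which is the situation where an empirically derived prior matters most. The appended signal 0.40 falls midway between the AA and AB clusters, so it should be left uncalled. In practice the prior means would be estimated from many SNPs rather than fixed by hand.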

Similar articles

1
Bayesian Gaussian Mixture Models for High-Density Genotyping Arrays.
J Am Stat Assoc. 2008 Mar 1;103(481):89-100. doi: 10.1198/016214507000000338.
2
Dynamic model based algorithms for screening and genotyping over 100 K SNPs on oligonucleotide microarrays.
Bioinformatics. 2005 May 1;21(9):1958-63. doi: 10.1093/bioinformatics/bti275. Epub 2005 Jan 18.
3
A multi-array multi-SNP genotyping algorithm for Affymetrix SNP microarrays.
Bioinformatics. 2007 Jun 15;23(12):1459-67. doi: 10.1093/bioinformatics/btm131. Epub 2007 Apr 25.
4
Fast genotyping of known SNPs through approximate k-mer matching.
Bioinformatics. 2016 Sep 1;32(17):i538-i544. doi: 10.1093/bioinformatics/btw460.
5
SNiPer: improved SNP genotype calling for Affymetrix 10K GeneChip microarray data.
BMC Genomics. 2005 Oct 31;6:149. doi: 10.1186/1471-2164-6-149.
6
Integration of Infinium and Axiom SNP array data in the outcrossing species Malus × domestica and causes for seemingly incompatible calls.
BMC Genomics. 2021 Apr 7;22(1):246. doi: 10.1186/s12864-021-07565-7.
7
A genotype calling algorithm for Affymetrix SNP arrays.
Bioinformatics. 2006 Jan 1;22(1):7-12. doi: 10.1093/bioinformatics/bti741. Epub 2005 Nov 2.
8
Smarter clustering methods for SNP genotype calling.
Bioinformatics. 2008 Dec 1;24(23):2665-71. doi: 10.1093/bioinformatics/btn509. Epub 2008 Sep 29.
9
Automated SNP genotype clustering algorithm to improve data completeness in high-throughput SNP genotyping datasets from custom arrays.
Genomics Proteomics Bioinformatics. 2007 Dec;5(3-4):256-9. doi: 10.1016/S1672-0229(08)60014-5.
10
SNiPer-HD: improved genotype calling accuracy by an expectation-maximization algorithm for high-density SNP arrays.
Bioinformatics. 2007 Jan 1;23(1):57-63. doi: 10.1093/bioinformatics/btl536. Epub 2006 Oct 24.

Cited by

1
Inferring genetic ancestry: opportunities, challenges, and implications.
Am J Hum Genet. 2010 May 14;86(5):661-73. doi: 10.1016/j.ajhg.2010.03.011.
2
Markov Models for inferring copy number variations from genotype data on Illumina platforms.
Hum Hered. 2009;68(1):1-22. doi: 10.1159/000210445. Epub 2009 Apr 1.
3
Smarter clustering methods for SNP genotype calling.
Bioinformatics. 2008 Dec 1;24(23):2665-71. doi: 10.1093/bioinformatics/btn509. Epub 2008 Sep 29.

References

1
A dictionary model for haplotyping, genotype calling, and association testing.
Genet Epidemiol. 2007 Nov;31(7):672-83. doi: 10.1002/gepi.20232.
2
A genotype calling algorithm for Affymetrix SNP arrays.
Bioinformatics. 2006 Jan 1;22(1):7-12. doi: 10.1093/bioinformatics/bti741. Epub 2005 Nov 2.
3
Comparative linkage analysis and visualization of high-density oligonucleotide SNP array data.
BMC Genet. 2005 Feb 15;6:7. doi: 10.1186/1471-2156-6-7.
4
Dynamic model based algorithms for screening and genotyping over 100 K SNPs on oligonucleotide microarrays.
Bioinformatics. 2005 May 1;21(9):1958-63. doi: 10.1093/bioinformatics/bti275. Epub 2005 Jan 18.
5
Estimation of genotype error rate using samples with pedigree information--an application on the GeneChip Mapping 10K array.
Genomics. 2004 Oct;84(4):623-30. doi: 10.1016/j.ygeno.2004.05.003.
6
Detect and adjust for population stratification in population-based association study using genomic control markers: an application of Affymetrix Genechip Human Mapping 10K array.
Eur J Hum Genet. 2004 Dec;12(12):1001-6. doi: 10.1038/sj.ejhg.5201273.
7
SNP Chart: an integrated platform for visualization and interpretation of microarray genotyping data.
Bioinformatics. 2005 Jan 1;21(1):124-7. doi: 10.1093/bioinformatics/bth470. Epub 2004 Aug 12.
8
Genomewide linkage analysis of bipolar disorder by use of a high-density single-nucleotide-polymorphism (SNP) genotyping assay: a comparison with microsatellite marker assays and finding of significant linkage to chromosome 6q22.
Am J Hum Genet. 2004 May;74(5):886-97. doi: 10.1086/420775. Epub 2004 Apr 1.
9
dChipSNP: significance curve and clustering of SNP-array-based loss-of-heterozygosity data.
Bioinformatics. 2004 May 22;20(8):1233-40. doi: 10.1093/bioinformatics/bth069. Epub 2004 Feb 10.
10
Genotyping of single nucleotide polymorphism using model-based clustering.
Bioinformatics. 2004 Mar 22;20(5):718-26. doi: 10.1093/bioinformatics/btg475. Epub 2004 Jan 29.