Comparing genotyping algorithms for Illumina's Infinium whole-genome SNP BeadChips.

Affiliation

Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia.

Publication information

BMC Bioinformatics. 2011 Mar 8;12:68. doi: 10.1186/1471-2105-12-68.

DOI:10.1186/1471-2105-12-68
PMID:21385424
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3063825/
Abstract

BACKGROUND

Illumina's Infinium SNP BeadChips are extensively used in both small and large-scale genetic studies. A fundamental step in any analysis is the processing of raw allele A and allele B intensities from each SNP into genotype calls (AA, AB, BB). Various algorithms which make use of different statistical models are available for this task. We compare four methods (GenCall, Illuminus, GenoSNP and CRLMM) on data where the true genotypes are known in advance and data from a recently published genome-wide association study.
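To make the genotype-calling step concrete, here is a minimal sketch that turns one pair of allele A and allele B intensities into a call. It is a toy threshold rule and not an implementation of GenCall, Illuminus, GenoSNP or CRLMM, all of which fit statistical models. The function name `call_genotype`, the polar-angle transform, and every threshold value are assumptions chosen for illustration only.

```python
import math

def call_genotype(a, b, het_band=(0.3, 0.7), margin=0.05, min_intensity=0.2):
    """Toy genotype caller (hypothetical; thresholds are illustrative).

    a, b: non-negative normalized intensities for allele A and allele B.
    Returns 'AA', 'AB', 'BB', or 'NC' (no call).
    """
    # Total signal; very weak signal is treated as uncallable.
    if a + b < min_intensity:
        return "NC"
    # Polar angle scaled to [0, 1]: 0 means pure A signal, 1 means pure B signal.
    theta = (2 / math.pi) * math.atan2(b, a)
    lo, hi = het_band
    if theta < lo - margin:
        return "AA"
    if theta > hi + margin:
        return "BB"
    if lo + margin <= theta <= hi - margin:
        return "AB"
    # Points falling near a cluster boundary are left uncalled.
    return "NC"

if __name__ == "__main__":
    for a, b in [(1.0, 0.05), (0.5, 0.5), (0.05, 1.0), (0.05, 0.05), (1.0, 0.5)]:
        print(a, b, call_genotype(a, b))
```

A fixed threshold like this ignores SNP-to-SNP and sample-to-sample variation in cluster position, which is the problem the model-based methods compared here are designed to handle.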

RESULTS

In general, differences in accuracy are relatively small between the methods evaluated, although CRLMM and GenoSNP were found to consistently outperform GenCall. The performance of Illuminus is heavily dependent on sample size, with lower no call rates and improved accuracy as the number of samples available increases. For X chromosome SNPs, methods with sex-dependent models (Illuminus, CRLMM) perform better than methods which ignore gender information (GenCall, GenoSNP). We observe that CRLMM and GenoSNP are more accurate at calling SNPs with low minor allele frequency than GenCall or Illuminus. The sample quality metrics from each of the four methods were found to have a high level of agreement at flagging samples with unusual signal characteristics.
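The quantities discussed above (no call rate, accuracy on data with known genotypes, and minor allele frequency) can be sketched as follows. This is an illustrative calculation under simple assumptions, not the evaluation code used in the study: the function names are hypothetical, no calls are encoded as `"NC"`, and accuracy is computed only over SNPs that received a call.

```python
def summarize_calls(calls, truth):
    """Return (no_call_rate, accuracy_among_called) for one set of calls.

    calls, truth: equal-length sequences of 'AA', 'AB', 'BB';
    calls may also contain 'NC' (no call). Names are hypothetical.
    """
    if len(calls) != len(truth):
        raise ValueError("calls and truth must have the same length")
    if not calls:
        raise ValueError("at least one genotype is required")
    called = [(c, t) for c, t in zip(calls, truth) if c != "NC"]
    no_call_rate = 1 - len(called) / len(calls)
    if called:
        accuracy = sum(c == t for c, t in called) / len(called)
    else:
        accuracy = float("nan")
    return no_call_rate, accuracy

def minor_allele_frequency(genotypes):
    """Minor allele frequency for one SNP across samples, ignoring no calls."""
    called = [g for g in genotypes if g in ("AA", "AB", "BB")]
    if not called:
        return float("nan")
    # Each sample carries two alleles; count the B alleles.
    freq_b = sum(g.count("B") for g in called) / (2 * len(called))
    return min(freq_b, 1 - freq_b)

if __name__ == "__main__":
    calls = ["AA", "AB", "NC", "BB", "AB"]
    truth = ["AA", "AB", "AB", "BB", "BB"]
    print(summarize_calls(calls, truth))
    print(minor_allele_frequency(truth))
```

Reporting the no call rate alongside accuracy matters because a method can raise its apparent accuracy simply by declining to call difficult SNPs.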

CONCLUSIONS

CRLMM, GenoSNP and GenCall can be applied with confidence in studies of any size, as their performance was shown to be invariant to the number of samples available. Illuminus on the other hand requires a larger number of samples to achieve comparable levels of accuracy and its use in smaller studies (50 or fewer individuals) is not recommended.
