Imputation across genotyping arrays for genome-wide association studies: assessment of bias and a correction strategy.

Affiliation

Behavioral Health Epidemiology Program, RTI International, 3040 Cornwallis Road, PO Box 12194, Research Triangle Park, NC 27709-12194, USA.

Publication information

Hum Genet. 2013 May;132(5):509-22. doi: 10.1007/s00439-013-1266-7. Epub 2013 Jan 22.

DOI:10.1007/s00439-013-1266-7
PMID:23334152
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3628082/
Abstract

A great promise of publicly sharing genome-wide association data is the potential to create composite sets of controls. However, studies often use different genotyping arrays, and imputation to a common set of SNPs has shown substantial bias: a problem which has no broadly applicable solution. Based on the idea that using differing genotyped SNP sets as inputs creates differential imputation errors and thus bias in the composite set of controls, we examined the degree to which each of the following occurs: (1) imputation based on the union of genotyped SNPs (i.e., SNPs available on one or more arrays) results in bias, as evidenced by spurious associations (type 1 error) between imputed genotypes and arbitrarily assigned case/control status; (2) imputation based on the intersection of genotyped SNPs (i.e., SNPs available on all arrays) does not evidence such bias; and (3) imputation quality varies by the size of the intersection of genotyped SNP sets. Imputations were conducted in European Americans and African Americans with reference to HapMap phase II and III data. Imputation based on the union of genotyped SNPs across the Illumina 1M and 550v3 arrays showed spurious associations for 0.2 % of SNPs: ~2,000 false positives per million SNPs imputed. Biases remained problematic for very similar arrays (550v1 vs. 550v3) and were substantial for dissimilar arrays (Illumina 1M vs. Affymetrix 6.0). In all instances, imputing based on the intersection of genotyped SNPs (as few as 30 % of the total SNPs genotyped) eliminated such bias while still achieving good imputation quality.
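The union versus intersection distinction in the abstract, and its false-positive arithmetic, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the array names, the rsIDs, and both helper functions are hypothetical, and no imputation is performed. It only shows which genotyped SNPs would be passed to imputation under each strategy, and how a 0.2% spurious-association rate translates to about 2,000 false positives per million imputed SNPs.

```python
def snp_sets_for_imputation(arrays):
    """Return (union, intersection) of the SNP IDs genotyped on each array.

    `arrays` maps a (hypothetical) array name to an iterable of SNP IDs.
    """
    sets = [set(snps) for snps in arrays.values()]
    if not sets:
        raise ValueError("at least one array is required")
    union = set().union(*sets)              # SNPs on one or more arrays
    intersection = set.intersection(*sets)  # SNPs on all arrays
    return union, intersection


def expected_false_positives(rate, n_imputed):
    """Expected count of spurious associations at a given per-SNP rate."""
    return rate * n_imputed


# Hypothetical genotyped SNP lists for two arrays (illustrative IDs only).
arrays = {
    "array_A": ["rs1", "rs2", "rs3", "rs4", "rs5"],
    "array_B": ["rs2", "rs3", "rs5", "rs6"],
}

union, intersection = snp_sets_for_imputation(arrays)
shared_fraction = len(intersection) / len(union)

print("union:", sorted(union))
print("intersection:", sorted(intersection))
print("shared fraction:", shared_fraction)

# 0.2% of SNPs spuriously associated, per one million imputed SNPs.
print("false positives per million:",
      round(expected_false_positives(0.002, 1_000_000)))
```

Under the intersection strategy, every sample in the composite control set is imputed from the same input SNPs, which is the condition the abstract reports as eliminating the differential imputation error.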

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ed1/3628082/e36bc7d4d06d/nihms-437755-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ed1/3628082/a49a27c8a464/nihms-437755-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ed1/3628082/cac926ddb0df/nihms-437755-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ed1/3628082/3142f1a7994a/nihms-437755-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ed1/3628082/7f399ef61624/nihms-437755-f0005.jpg
