
Extending the use of GWAS data by combining data from different genetic platforms.

Authors

van Iperen E P A, Hovingh G K, Asselbergs F W, Zwinderman A H

Affiliations

Durrer Center for Cardiovascular Research, Netherlands Heart Institute, Utrecht, The Netherlands.

Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, Amsterdam, The Netherlands.

Publication information

PLoS One. 2017 Feb 28;12(2):e0172082. doi: 10.1371/journal.pone.0172082. eCollection 2017.

DOI:10.1371/journal.pone.0172082
PMID:28245255
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5330464/

Abstract

BACKGROUND

Over the past decade, many genome-wide association studies (GWAS) have discovered new associations between single-nucleotide polymorphisms (SNPs) and various phenotypes. Imputation methods are widely used in GWAS: they allow phenotypes to be tested for association with variants that were not directly genotyped. Imputation can also be used to combine and analyse data genotyped on different genotyping arrays. In this study we investigated the imputation quality and efficiency of two approaches to combining GWAS data from different genotyping platforms. Specifically, we tested whether combining the data from the different platforms before imputation performs better than combining them after imputation.

METHODS

In total, 979 unique individuals from the AMC-PAS cohort were genotyped on 3 different platforms: 706 individuals on the MetaboChip, 757 on the 50K gene-centric Human CVD BeadChip, and 955 on the HumanExome chip. A total of 397 individuals were genotyped on all 3 platforms. After pre-imputation quality control (QC), Minimac in combination with MaCH was used to impute all samples against the 1000 Genomes reference panel. All imputed markers with an r² value of <0.3 were excluded in our post-imputation QC.
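The post-imputation QC rule above (exclude markers with r² < 0.3) can be illustrated with a minimal sketch. The `ImputedMarker` record, the marker names, and the example values are hypothetical and are not taken from the study; only the 0.3 threshold comes from the abstract.

```python
from typing import NamedTuple

R2_THRESHOLD = 0.3  # threshold stated in the abstract


class ImputedMarker(NamedTuple):
    marker_id: str  # hypothetical identifier, for illustration only
    r2: float       # estimated imputation quality for this marker


def post_imputation_qc(markers, threshold=R2_THRESHOLD):
    """Keep markers whose r2 is at or above the threshold.

    Markers with r2 strictly below the threshold are excluded,
    matching the "r2 < 0.3 excluded" rule.
    """
    return [m for m in markers if m.r2 >= threshold]


# Made-up example values, not study data.
markers = [
    ImputedMarker("snp_a", 0.95),
    ImputedMarker("snp_b", 0.29),
    ImputedMarker("snp_c", 0.30),
]
kept = post_imputation_qc(markers)
print([m.marker_id for m in kept])
```

A marker sitting exactly at 0.30 is retained here because the abstract excludes only values below the threshold.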

RESULTS

A total of 397 individuals were genotyped on all three platforms. All three datasets were carefully matched on strand, SNP ID and genomic coordinates. This resulted in a combined dataset of 979 unique individuals and 258,925 unique markers. A total of 4,117,036 SNPs were available when imputation was performed before merging the three datasets, compared with 3,933,494 SNPs when imputation was performed on the combined set. Our results suggest that imputing the individual datasets before merging performs slightly better than imputing after the datasets have been combined.
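The abstract does not say how the matching on strand, SNP ID and genomic coordinates was carried out. As a rough sketch of what such harmonization can look like, the hypothetical `harmonize` function below aligns one marker to a reference record: it requires identical coordinates, flips alleles to the complementary strand when needed, and adopts the reference ID. The dictionary layout and all names are assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flip(alleles):
    """Return the alleles as read from the complementary strand."""
    return tuple(COMPLEMENT[a] for a in alleles)


def harmonize(marker, reference):
    """Align a marker to a reference record, or return None if they differ.

    Both arguments are dicts with keys: id, chrom, pos, alleles
    (a tuple of two bases). This layout is hypothetical.
    Note: palindromic SNPs (A/T or C/G) look identical on both strands,
    so this check cannot resolve their strand; real pipelines usually
    handle them separately.
    """
    if (marker["chrom"], marker["pos"]) != (reference["chrom"], reference["pos"]):
        return None  # genomic coordinates do not match
    ref_alleles = set(reference["alleles"])
    if set(marker["alleles"]) == ref_alleles:
        alleles = marker["alleles"]        # already on the same strand
    elif set(flip(marker["alleles"])) == ref_alleles:
        alleles = flip(marker["alleles"])  # strand flip needed
    else:
        return None  # allele sets are incompatible
    return {**marker, "id": reference["id"], "alleles": alleles}


# Made-up example records, not study data.
ref = {"id": "rs1", "chrom": "1", "pos": 1000, "alleles": ("A", "G")}
chip = {"id": "chip_1", "chrom": "1", "pos": 1000, "alleles": ("T", "C")}
aligned = harmonize(chip, ref)
print(aligned)
```

Returning `None` for unmatched markers keeps the decision about dropping or reviewing them with the caller.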

CONCLUSIONS

Imputing datasets genotyped on different platforms before merging yields more SNPs than imputing after the datasets have been combined.
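To put the reported SNP counts in perspective, the short calculation below works out the absolute and relative difference between the two strategies. The two counts come from the Results; the variable names are illustrative.

```python
# SNP counts reported in the Results section.
impute_then_merge = 4_117_036  # imputation performed before merging
merge_then_impute = 3_933_494  # imputation performed on the combined set

extra_snps = impute_then_merge - merge_then_impute
relative_gain = extra_snps / merge_then_impute

print(extra_snps)              # 183542
print(f"{relative_gain:.1%}")  # 4.7%
```

On these figures, imputing before merging yields roughly 4.7% more SNPs, which is consistent with the "slightly better" wording in the Results.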

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e46/5330464/e4e22c4c8382/pone.0172082.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e46/5330464/09167736b753/pone.0172082.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e46/5330464/defb86ea2fe9/pone.0172082.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e46/5330464/4d2d1ee0ee17/pone.0172082.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e46/5330464/e46b5ac29be1/pone.0172082.g005.jpg
