Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks.

Affiliations

Institute of Biological Psychiatry, Mental Health Center Sankt Hans, Roskilde, 4000, Denmark.

The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark.

Publication information

Commun Biol. 2023 Jan 26;6(1):101. doi: 10.1038/s42003-023-04477-y.

DOI:10.1038/s42003-023-04477-y
PMID:36697501
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9876938/
Abstract

Sample recruitment for research consortia, biobanks, and personal genomics companies spans years, necessitating genotyping in batches using different technologies. As marker content on genotyping arrays varies, integrating such datasets is non-trivial, and its impact on haplotype estimation (phasing) and whole genome imputation, both necessary steps for complex trait analysis, remains under-evaluated. Using the iPSYCH dataset, comprising 130,438 individuals genotyped in two stages on different arrays, we evaluated phasing and imputation performance across multiple phasing methods and data integration protocols. While phasing accuracy varied by choice of method and data integration protocol, imputation accuracy varied mostly between data integration protocols. We demonstrate an attenuation in imputation accuracy within samples of non-European origin, highlighting challenges to studying complex traits in diverse populations. Finally, imputation errors can bias association tests and reduce the predictive utility of polygenic scores. Carefully optimized data integration strategies enhance accuracy and replicability of complex trait analyses in complex biobanks.
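
The abstract refers to phasing accuracy and imputation accuracy without saying how they were measured. As a rough illustration only, the sketch below computes two metrics commonly used for these quantities: the switch error rate for phasing and the squared correlation between true genotypes and imputed dosages for imputation. The function names, input layout, and choice of metrics are assumptions made for this example; they are not taken from the paper.

```python
def switch_error_rate(true_hap, est_hap):
    """Fraction of adjacent heterozygous-site pairs whose relative phase is flipped.

    Both inputs hold the alleles (0/1) carried by one haplotype of the same
    individual, restricted to heterozygous sites (assumed input layout).
    """
    if len(true_hap) != len(est_hap):
        raise ValueError("haplotypes must cover the same sites")
    # At each site, does the estimated haplotype carry the same allele as the truth?
    match = [t == e for t, e in zip(true_hap, est_hap)]
    # A switch error occurs wherever agreement changes between neighbouring sites.
    switches = sum(1 for a, b in zip(match, match[1:]) if a != b)
    pairs = len(match) - 1
    return switches / pairs if pairs > 0 else 0.0


def dosage_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes (0/1/2) and imputed
    dosages (0..2) for one variant across individuals."""
    if len(true_genotypes) != len(imputed_dosages):
        raise ValueError("inputs must cover the same individuals")
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        return 0.0  # correlation is undefined for a monomorphic variant
    return (cov * cov) / (var_t * var_d)
```

Averaging `dosage_r2` separately within ancestry groups or minor-allele-frequency bins is one way the attenuation described for non-European samples could be examined, assuming held-out true genotypes are available.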

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6231/9876938/362c5b11f798/42003_2023_4477_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6231/9876938/4fdcd52af122/42003_2023_4477_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6231/9876938/d45ee0a0c758/42003_2023_4477_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6231/9876938/2f5edd2a28a5/42003_2023_4477_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6231/9876938/abb1fcff5a18/42003_2023_4477_Fig4_HTML.jpg
