Genotype imputation using the Positional Burrows Wheeler Transform.

Affiliations

Department of Computational Biology, University of Lausanne, Lausanne, Switzerland.

Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland.

Publication information

PLoS Genet. 2020 Nov 16;16(11):e1009049. doi: 10.1371/journal.pgen.1009049. eCollection 2020 Nov.

DOI:10.1371/journal.pgen.1009049
PMID:33196638
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7704051/
Abstract

Genotype imputation is the process of predicting unobserved genotypes in a sample of individuals using a reference panel of haplotypes. In the last 10 years reference panels have increased in size by more than 100 fold. Increasing reference panel size improves accuracy of markers with low minor allele frequencies but poses ever increasing computational challenges for imputation methods. Here we present IMPUTE5, a genotype imputation method that can scale to reference panels with millions of samples. This method continues to refine the observation made in the IMPUTE2 method, that accuracy is optimized via use of a custom subset of haplotypes when imputing each individual. It achieves fast, accurate, and memory-efficient imputation by selecting haplotypes using the Positional Burrows Wheeler Transform (PBWT). By using the PBWT data structure at genotyped markers, IMPUTE5 identifies locally best matching haplotypes and long identical by state segments. The method then uses the selected haplotypes as conditioning states within the IMPUTE model. Using the HRC reference panel, which has ∼65,000 haplotypes, we show that IMPUTE5 is up to 30x faster than MINIMAC4 and up to 3x faster than BEAGLE5.1, and uses less memory than both these methods. Using simulated reference panels we show that IMPUTE5 scales sub-linearly with reference panel size. For example, keeping the number of imputed markers constant, increasing the reference panel size from 10,000 to 1 million haplotypes requires less than twice the computation time. As the reference panel increases in size IMPUTE5 is able to utilize a smaller number of reference haplotypes, thus reducing computational cost.
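The PBWT-based selection step described in the abstract can be illustrated with a toy sketch. This is not IMPUTE5's implementation: the function names, the `n_neighbors` parameter, and the choice to query every site are assumptions made for illustration. The sketch builds the positional prefix array (haplotypes ordered by their reversed prefixes, which is the core PBWT update), inserts the target haplotype into the panel, and at each genotyped marker collects the reference haplotypes that sit immediately next to the target in that ordering. Haplotypes sharing a long match ending at a site tend to be adjacent, so neighbours are natural candidates for conditioning states.

```python
def pbwt_prefix_arrays(haps):
    """Yield the positional prefix array after each site.

    haps is a list of equal-length lists of 0/1 alleles. After site k, the
    yielded order sorts haplotype indices by their reversed prefix ending
    at site k, so haplotypes sharing a long match ending at k sit together.
    """
    order = list(range(len(haps)))
    for k in range(len(haps[0])):
        # Stable partition: haplotypes carrying allele 0 first, then allele 1.
        zeros = [h for h in order if haps[h][k] == 0]
        ones = [h for h in order if haps[h][k] == 1]
        order = zeros + ones
        yield order


def select_conditioning_states(ref_haps, target, n_neighbors=1):
    """Pick reference haplotypes adjacent to the target in the PBWT order.

    Hypothetical helper: the target is appended to the panel, and at every
    site up to n_neighbors haplotypes above and below it are collected.
    """
    haps = list(ref_haps) + [target]
    target_idx = len(haps) - 1
    selected = set()
    for order in pbwt_prefix_arrays(haps):
        pos = order.index(target_idx)
        lo = max(0, pos - n_neighbors)
        hi = pos + n_neighbors + 1
        selected.update(h for h in order[lo:hi] if h != target_idx)
    return sorted(selected)


# Toy panel: four reference haplotypes typed at four markers.
ref = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
target = [0, 1, 0, 1]  # identical to reference haplotype 2
states = select_conditioning_states(ref, target, n_neighbors=1)
print(states)
```

In this toy panel the reference haplotype identical to the target is adjacent to it at every marker, so it is always selected. Because the number of selected states depends on `n_neighbors` and the number of queried sites rather than on the panel size, a selection of this kind is one way the size of the conditioning set can stay small as the reference panel grows.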


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eeb5/7704051/a97afc98cf22/pgen.1009049.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eeb5/7704051/2d3acba5c3a3/pgen.1009049.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eeb5/7704051/2a34dfe40129/pgen.1009049.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eeb5/7704051/77fcb16a726f/pgen.1009049.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eeb5/7704051/6cdea07132b2/pgen.1009049.g005.jpg
