Rapid, Reference-Free human genotype imputation with denoising autoencoders.

Affiliations

Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States.

Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States.

Publication information

Elife. 2022 Sep 23;11:e75600. doi: 10.7554/eLife.75600.

DOI:10.7554/eLife.75600

PMID:36148981

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9555874/

Abstract

Genotype imputation is a foundational tool for population genetics. Standard statistical imputation approaches rely on the co-location of large whole-genome sequencing-based reference panels, powerful computing environments, and potentially sensitive genetic study data. This results in computational resource and privacy-risk barriers to access to cutting-edge imputation techniques. Moreover, the accuracy of current statistical approaches is known to degrade in regions of low and complex linkage disequilibrium. Artificial neural network-based imputation approaches may overcome these limitations by encoding complex genotype relationships in easily portable inference models. Here, we demonstrate an autoencoder-based approach for genotype imputation, using a large, commonly used reference panel, and spanning the entirety of human chromosome 22. Our autoencoder-based genotype imputation strategy achieved superior imputation accuracy across the allele-frequency spectrum and across genomes of diverse ancestry, while delivering at least fourfold faster inference run time relative to standard imputation tools.
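The idea of a denoising autoencoder for imputation can be sketched in a few lines: hide a random subset of alleles, train a network to reconstruct the complete vector, then use the trained weights alone (no reference panel) to fill in untyped positions. The following is a toy illustration only, not the authors' architecture or training setup. It assumes haploid 0/1 alleles, simulated data, a single hidden layer, and plain NumPy gradient descent; every function name and hyperparameter here is hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def make_toy_haplotypes(n_samples=200, n_variants=20, n_templates=4, flip=0.02, seed=0):
    """Simulate correlated 0/1 alleles by copying a few template haplotypes (toy data)."""
    rng = np.random.default_rng(seed)
    templates = rng.integers(0, 2, size=(n_templates, n_variants))
    haps = templates[rng.integers(0, n_templates, size=n_samples)]
    flips = rng.random(haps.shape) < flip
    return np.where(flips, 1 - haps, haps).astype(float)


def forward(params, x_in):
    """Encoder (tanh) followed by decoder (sigmoid): returns hidden code and P(alt allele)."""
    W1, b1, W2, b2 = params
    h = np.tanh(x_in @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)


def train(x_clean, hidden=16, mask_rate=0.5, epochs=300, lr=0.1, seed=0):
    """Train on randomly masked inputs to reconstruct the unmasked alleles."""
    rng = np.random.default_rng(seed)
    n, m = x_clean.shape
    params = [rng.normal(0, 0.1, (m, hidden)), np.zeros(hidden),
              rng.normal(0, 0.1, (hidden, m)), np.zeros(m)]
    losses = []
    for _ in range(epochs):
        # Corruption step: inputs are coded -1/+1, and 0 marks a hidden variant.
        keep = rng.random((n, m)) > mask_rate
        x_noisy = (2.0 * x_clean - 1.0) * keep
        h, y = forward(params, x_noisy)
        # Cross-entropy against the uncorrupted alleles, summed over variants.
        y_c = np.clip(y, 1e-9, 1 - 1e-9)
        loss = -np.mean(np.sum(x_clean * np.log(y_c)
                               + (1 - x_clean) * np.log(1 - y_c), axis=1))
        losses.append(float(loss))
        # Backpropagation for the sigmoid + cross-entropy output layer.
        d_out = (y - x_clean) / n
        d_hid = (d_out @ params[2].T) * (1.0 - h ** 2)
        grads = [x_noisy.T @ d_hid, d_hid.sum(axis=0), h.T @ d_out, d_out.sum(axis=0)]
        for p, g in zip(params, grads):
            p -= lr * g  # in-place update of each weight array
    return params, losses


def impute(params, alleles, observed):
    """Keep observed alleles; fill the rest with the model's most likely allele.

    `alleles` must hold a finite placeholder (e.g. 0) where `observed` is False.
    """
    x_in = (2.0 * alleles - 1.0) * observed
    _, prob_alt = forward(params, x_in)
    return np.where(observed, alleles, (prob_alt > 0.5).astype(float))
```

At inference time only `params` is needed, which is the sense in which such a model is portable: `impute(params, sample, observed_mask)` runs a single forward pass and never touches the training genotypes.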

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6598/9555874/3b58375d19b4/elife-75600-fig1.jpg
