Hybrid autoencoder with orthogonal latent space for robust population structure inference.

Affiliations

Department of Electrical Engineering, ESAT/PSI, KU Leuven, Leuven, Belgium.

Department of Human Genetics, KU Leuven, Leuven, Belgium.

Publication information

Sci Rep. 2023 Feb 14;13(1):2612. doi: 10.1038/s41598-023-28759-x.

DOI:10.1038/s41598-023-28759-x
PMID:36788253
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9929087/
Abstract

Analysis of population structure and genomic ancestry remains an important topic in human genetics and bioinformatics. Commonly used methods require high-quality genotype data to ensure accurate inference. However, in practice, laboratory artifacts and outliers are often present in the data. Moreover, existing methods are typically affected by the presence of related individuals in the dataset. In this work, we propose a novel hybrid method, called SAE-IBS, which combines the strengths of traditional matrix decomposition-based (e.g., principal component analysis) and more recent neural network-based (e.g., autoencoders) solutions. Namely, it yields an orthogonal latent space enhancing dimensionality selection while learning non-linear transformations. The proposed approach achieves higher accuracy than existing methods for projecting poor quality target samples (genotyping errors and missing data) onto a reference ancestry space and generates a robust ancestry space in the presence of relatedness. We introduce a new approach and an accompanying open-source program for robust ancestry inference in the presence of missing data, genotyping errors, and relatedness. The obtained ancestry space allows for non-linear projections and exhibits orthogonality with clearly separable population groups.
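
To make the terms "orthogonal latent space" and "projecting target samples onto a reference ancestry space" concrete, the sketch below shows the traditional matrix-decomposition baseline the abstract refers to (principal component analysis computed via SVD). It is not SAE-IBS and not the authors' program: the function names, the toy allele frequencies, and the mean-imputation step for missing genotypes are all assumptions made for illustration.

```python
import numpy as np


def fit_reference_space(g_ref, k):
    """PCA via SVD on a reference genotype matrix (samples x SNPs, coded 0/1/2).

    Hypothetical helper: returns the per-SNP mean and the first k loadings.
    """
    mean = g_ref.mean(axis=0)
    _, _, vt = np.linalg.svd(g_ref - mean, full_matrices=False)
    return mean, vt[:k].T  # loadings: SNPs x k, orthonormal columns


def project(g_target, mean, loadings):
    """Project samples onto the reference axes.

    Missing genotypes (NaN) are replaced by the reference mean, which is a
    simple assumed strategy and not the method proposed in the paper.
    """
    filled = np.where(np.isnan(g_target), mean, g_target)
    return (filled - mean) @ loadings


rng = np.random.default_rng(0)
n_snps = 200

# Two toy populations with different (randomly drawn) allele frequencies.
freq_a = rng.uniform(0.05, 0.95, size=n_snps)
freq_b = rng.uniform(0.05, 0.95, size=n_snps)
pop_a = rng.binomial(2, freq_a, size=(30, n_snps)).astype(float)
pop_b = rng.binomial(2, freq_b, size=(30, n_snps)).astype(float)
g_ref = np.vstack([pop_a, pop_b])

mean, loadings = fit_reference_space(g_ref, k=2)
scores = project(g_ref, mean, loadings)

# Orthogonality: the axes are orthonormal and the reference scores are
# uncorrelated, so each latent dimension carries distinct variation.
gram = loadings.T @ loadings
score_cov = scores.T @ scores

# A poor-quality target sample: 20% of its genotypes are missing.
target = rng.binomial(2, freq_a, size=(1, n_snps)).astype(float)
missing = rng.choice(n_snps, size=40, replace=False)
target[0, missing] = np.nan
target_scores = project(target, mean, loadings)

print("axes orthonormal:", np.allclose(gram, np.eye(2)))
print("target coordinates:", target_scores)
```

In this linear baseline, filling missing genotypes with the reference mean removes their contribution to the projection, so a low-quality sample is pulled toward the origin of the ancestry space. This is one way projection accuracy can degrade with missing data, which is the setting the abstract says SAE-IBS is designed to handle more robustly.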

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fc7/9929087/14b527177306/41598_2023_28759_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fc7/9929087/15ab28a76ef3/41598_2023_28759_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fc7/9929087/25ddb357fbe0/41598_2023_28759_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fc7/9929087/94028816f955/41598_2023_28759_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fc7/9929087/a315a710f41a/41598_2023_28759_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fc7/9929087/e2bbf445b6de/41598_2023_28759_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fc7/9929087/21c5698b2f1a/41598_2023_28759_Fig7_HTML.jpg
