

Using expression quantitative trait loci data and graph-embedded neural networks to uncover genotype-phenotype interactions.

Author information

Guo Xinpeng, Han Jinyu, Song Yafei, Yin Zhilei, Liu Shuaichen, Shang Xuequn

Affiliations

School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an, China.

School of Air and Missile Defense, Air Force Engineering University, Xi'an, China.

Publication information

Front Genet. 2022 Aug 15;13:921775. doi: 10.3389/fgene.2022.921775. eCollection 2022.

DOI:10.3389/fgene.2022.921775
PMID:36046233
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9421127/
Abstract

A central goal of current biology is to establish a complete functional link between the genotype and phenotype, known as the genotype-phenotype map. With the continuous development of high-throughput technology and the decline in sequencing costs, multi-omics analysis has become more widely employed. While this gives us new opportunities to uncover the correlation mechanisms between single-nucleotide polymorphisms (SNPs), genes, and phenotypes, multi-omics still faces certain challenges, specifically: 1) when the sample size is large enough, the number of omics types is often not large enough to meet the requirements of multi-omics analysis; 2) the internal correlations within each omics layer are often unclear, such as the correlation between genes in genomics; 3) when analyzing a large number of traits (p), the sample size (n) is often smaller than p, hindering the application of machine learning methods in the classification of disease outcomes. To address these issues and build a robust classification model, we propose a graph-embedded deep neural network (G-EDNN) based on expression quantitative trait loci (eQTL) data, which achieves sparse connectivity between network layers to prevent overfitting. The correlation within each omics layer is also considered, so that the model more closely resembles biological reality. To verify the capabilities of this method, we conducted experimental analysis using the GSE28127 and GSE95496 data sets from the Gene Expression Omnibus (GEO) database, tested various neural network architectures, and used prior data for feature selection and graph embedding. Results show that the proposed method achieves high classification accuracy and easy-to-interpret feature selection. This method represents an extended application of genotype-phenotype association analysis in deep learning networks.
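The abstract says that G-EDNN achieves sparse connectivity between network layers, but it does not give the construction. One common way to obtain such sparsity, sketched below, is to multiply a layer's weight matrix element-wise by a 0/1 feature graph so that only connected features exchange information. This is an illustrative assumption, not the authors' implementation: the function name `graph_embedded_layer`, the toy adjacency matrix, and the choice of ReLU and self-loops are all hypothetical.

```python
import numpy as np

def graph_embedded_layer(x, weights, bias, adjacency):
    """Forward pass of one sparsely connected layer (illustrative sketch).

    x:         (n_samples, p) input features, e.g. SNP or gene values
    weights:   (p, p) trainable weights
    bias:      (p,) trainable bias
    adjacency: (p, p) 0/1 feature graph (assumed to come from prior data)
    """
    p = adjacency.shape[0]
    # Add self-loops so every feature keeps a path to its own hidden unit.
    mask = np.clip(adjacency + np.eye(p), 0, 1)
    # Weights for unconnected feature pairs are zeroed out by the mask.
    pre_activation = x @ (weights * mask) + bias
    return np.maximum(pre_activation, 0.0)  # ReLU

# Toy example: 4 features forming two disconnected pairs (0-1 and 2-3).
rng = np.random.default_rng(0)
p = 4
adjacency = np.array(
    [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
x = rng.normal(size=(3, p))
weights = rng.normal(size=(p, p))
bias = np.zeros(p)
hidden = graph_embedded_layer(x, weights, bias, adjacency)
```

In this toy setup, hidden unit 0 depends only on features 0 and 1, so the layer has 8 effective weights instead of 16. With n smaller than p, cutting the number of free parameters this way is one route to limiting overfitting.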


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/754c/9421127/914ec093fff7/fgene-13-921775-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/754c/9421127/cfe68866aec4/fgene-13-921775-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/754c/9421127/0a771a893512/fgene-13-921775-g002.jpg

