Gene prediction of aging-related diseases based on DNN and Mashup.

Affiliation

School of Information Science and Engineering, Yunnan University, Kunming, 650000, China.

Publication information

BMC Bioinformatics. 2021 Dec 17;22(1):597. doi: 10.1186/s12859-021-04518-5.

DOI:10.1186/s12859-021-04518-5
PMID:34920719
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8680025/
Abstract

BACKGROUND

At present, bioinformatics research on the relationship between aging-related diseases and genes relies mainly on machine learning multi-label models that classify each gene. Most existing methods for predicting pathogenic genes either rely on specific types of gene features, or encode multiple features of different dimensions with the same encoder, concatenate them, and predict the final result, which limits the applicability of the algorithm. Possible shortcomings include incomplete coverage of gene features by a single type of omics data, overfitting of low-dimensional datasets by a single encoder, and underfitting of higher-dimensional datasets.

METHODS

We use known gene-disease association data and gene descriptors, such as Gene Ontology (GO) terms, protein-protein interaction (PPI) data, PathDIP, and Kyoto Encyclopedia of Genes and Genomes (KEGG) data, as input to a deep learning model that predicts associations between genes and diseases. Our innovation is to use the Mashup algorithm to reduce the dimensionality of PPI, GO, and other large biological networks, to add new pathway data from the KEGG database, and then to combine these biological information sources through a modular deep neural network (DNN) to predict genes related to aging diseases.
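The pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a simplified, Mashup-inspired embedding (random walk with restart on each network, followed by a truncated SVD of the log-transformed diffusion states) and an untrained forward pass through a modular network with one encoder per data source. The function names, the toy networks, the layer sizes, and the random weights are all hypothetical.

```python
import numpy as np

def rwr_diffusion(adj, restart=0.5):
    """Closed-form random walk with restart; column i is the diffusion state of node i."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0            # guard against isolated nodes
    trans = adj / col_sums                   # column-stochastic transition matrix
    n = adj.shape[0]
    return restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * trans)

def embed_networks(adjs, dim, restart=0.5):
    """Compress several networks over the same genes into one low-dimensional embedding."""
    n = adjs[0].shape[0]
    # One row per gene; log transform with a 1/n pseudocount keeps values finite.
    blocks = [np.log(rwr_diffusion(a, restart).T + 1.0 / n) for a in adjs]
    stacked = np.hstack(blocks)              # shape: (n, n * number_of_networks)
    u, s, _ = np.linalg.svd(stacked, full_matrices=False)
    return u[:, :dim] * np.sqrt(s[:dim])     # keep the top `dim` components

def modular_forward(features, encoders, head):
    """Untrained forward pass: per-source ReLU encoder, concatenation, sigmoid multi-label head."""
    hidden = [np.maximum(x @ w, 0.0) for x, w in zip(features, encoders)]
    logits = np.hstack(hidden) @ head
    return 1.0 / (1.0 + np.exp(-logits))

# Toy data (hypothetical): two 4-gene networks standing in for PPI and GO similarity.
ppi = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
go = np.array([[0, 1, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1], [0, 1, 1, 0]], dtype=float)
emb = embed_networks([ppi, go], dim=2)

rng = np.random.default_rng(0)
kegg = rng.integers(0, 2, size=(4, 6)).astype(float)   # hypothetical pathway membership
encoders = [rng.normal(size=(2, 3)), rng.normal(size=(6, 3))]
head = rng.normal(size=(6, 5))                          # 5 hypothetical disease labels
probs = modular_forward([emb, kegg], encoders, head)    # one probability per gene and label
print(emb.shape, probs.shape)
```

Giving each source its own encoder lets a small feature set and a large one be compressed separately before they are combined, which is the motivation the abstract gives for a modular design over a single shared encoder.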

RESULTS AND CONCLUSION

The results show that our algorithm is more effective than the standard neural network algorithm (the area under the ROC curve increases from 0.8795 to 0.9153), a gradient boosted tree classifier, and a logistic regression classifier. In this paper, we first use a DNN to learn, from the complex multi-dimensional feature space, genes similar to those associated with known diseases, and then provide evidence that the hypothesized genes are associated with a given disease.
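For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The sketch below computes it for a single label from that pairwise definition. The labels and scores are made-up toy values, and the abstract does not state how the paper aggregates the metric across disease labels, so no averaging scheme is shown.

```python
import numpy as np

def roc_auc(labels, scores):
    """ROC AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative example")
    diff = pos[:, None] - neg[None, :]       # every positive score minus every negative score
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))

# Toy example (hypothetical): 2 disease-associated genes among 5.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)                # 5 of 6 pairs are ranked correctly
print(round(auc, 4))
```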

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/343f/8680025/41faf7d65347/12859_2021_4518_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/343f/8680025/65eb23b53a54/12859_2021_4518_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/343f/8680025/500d5f42252d/12859_2021_4518_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/343f/8680025/1adb514c87ba/12859_2021_4518_Fig4_HTML.jpg
