Phenotype prediction using biologically interpretable neural networks on multi-cohort multi-omics data.

Affiliations

Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands.

Department of Internal Medicine, Erasmus MC, Rotterdam, The Netherlands.

Publication information

NPJ Syst Biol Appl. 2024 Aug 2;10(1):81. doi: 10.1038/s41540-024-00405-w.

DOI: 10.1038/s41540-024-00405-w
PMID: 39095438
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11297229/

Abstract

Integrating multi-omics data into predictive models has the potential to enhance accuracy, which is essential for precision medicine. In this study, we developed interpretable predictive models for multi-omics data by employing neural networks informed by prior biological knowledge, referred to as visible networks. These neural networks offer insights into the decision-making process and can unveil novel perspectives on the underlying biological mechanisms associated with traits and complex diseases. We tested the performance, interpretability and generalizability for inferring smoking status, subject age and LDL levels using genome-wide RNA expression and CpG methylation data from the blood of the BIOS consortium (four population cohorts, N = 2940). In a cohort-wise cross-validation setting, the consistency of the diagnostic performance and interpretation was assessed. Performance was consistently high for predicting smoking status with an overall mean AUC of 0.95 (95% CI: 0.90-1.00) and interpretation revealed the involvement of well-replicated genes such as AHRR, GPR15 and LRRN3. LDL-level predictions were only generalized in a single cohort with an R² of 0.07 (95% CI: 0.05-0.08). Age was inferred with a mean error of 5.16 (95% CI: 3.97-6.35) years with the genes COL11A2, AFAP1, OTUD7A, PTPRN2, ADARB2 and CD34 consistently predictive. For both regression tasks, we found that using multi-omics networks improved performance, stability and generalizability compared to interpretable single omic networks. We believe that visible neural networks have great potential for multi-omics analysis; they combine multi-omic data elegantly, are interpretable, and generalize well to data from different cohorts.
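The two ideas the abstract relies on, a network whose connections follow prior biological annotation and an evaluation that holds out whole cohorts, can be illustrated with a small sketch. This is not the authors' implementation. The feature names, the feature-to-gene mapping, the tanh activation, the cohort labels, and the leave-one-cohort-out reading of "cohort-wise cross-validation" are all assumptions made for illustration; the paper's actual architecture and split may differ.

```python
import math
from typing import Dict, Iterator, List, Sequence, Tuple

# Hypothetical annotation: which input features (CpG sites, transcripts)
# are wired to which gene node. Real annotations would come from a
# genome annotation resource; these names are made up.
GENE_INPUTS: Dict[str, List[str]] = {
    "AHRR": ["cpg_AHRR_1", "cpg_AHRR_2", "rna_AHRR"],
    "GPR15": ["cpg_GPR15_1", "rna_GPR15"],
}


def gene_layer(
    features: Dict[str, float],
    weights: Dict[Tuple[str, str], float],
    gene_inputs: Dict[str, List[str]] = GENE_INPUTS,
) -> Dict[str, float]:
    """Sparsely connected layer: each gene node sees only its annotated inputs.

    Because a node's activation depends on a known, named set of inputs,
    its learned weights can be read per gene (assumed tanh activation).
    """
    activations = {}
    for gene, inputs in gene_inputs.items():
        z = sum(weights[(name, gene)] * features[name] for name in inputs)
        activations[gene] = math.tanh(z)
    return activations


def leave_one_cohort_out(
    cohorts: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_cohort, train_indices, test_indices) per cohort.

    No sample from the held-out cohort appears in the training indices,
    so each fold measures generalization to an unseen cohort.
    """
    for held_out in sorted(set(cohorts)):
        train = [i for i, c in enumerate(cohorts) if c != held_out]
        test = [i for i, c in enumerate(cohorts) if c == held_out]
        yield held_out, train, test
```

Calling `gene_layer` with one sample's feature values and a weight per (feature, gene) pair returns one activation per gene, and iterating over `leave_one_cohort_out(labels)` gives one fold per distinct cohort label. Stacking further annotated layers (for example gene to pathway) would follow the same pattern.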

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b8ed/11297229/ec7c9e701dde/41540_2024_405_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b8ed/11297229/61a3909f7a83/41540_2024_405_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b8ed/11297229/b40071261bef/41540_2024_405_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b8ed/11297229/5cb5fbf7aed3/41540_2024_405_Fig4_HTML.jpg
