Machine Learning for Integrating Data in Biology and Medicine: Principles, Practice, and Opportunities.

Authors

Zitnik Marinka, Nguyen Francis, Wang Bo, Leskovec Jure, Goldenberg Anna, Hoffman Michael M

Affiliations

Department of Computer Science, Stanford University, Stanford, CA, USA.

Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada.

Publication information

Inf Fusion. 2019 Oct;50:71-91. doi: 10.1016/j.inffus.2018.09.012. Epub 2018 Sep 21.

DOI:10.1016/j.inffus.2018.09.012
PMID:30467459
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6242341/

Abstract

New technologies have enabled the investigation of biology and human health at an unprecedented scale and in multiple dimensions. These dimensions include myriad properties describing genome, epigenome, transcriptome, microbiome, phenotype, and lifestyle. No single data type, however, can capture the complexity of all the factors relevant to understanding a phenomenon such as a disease. Integrative methods that combine data from multiple technologies have thus emerged as critical statistical and computational approaches. The key challenge in developing such approaches is the identification of effective models to provide a comprehensive and relevant systems view. An ideal method can answer a biological or medical question, identifying important features and predicting outcomes, by harnessing heterogeneous data across several dimensions of biological variation. In this Review, we describe the principles of data integration and discuss current methods and available implementations. We provide examples of successful data integration in biology and medicine. Finally, we discuss current challenges in biomedical integrative methods and our perspective on the future development of the field.
