Applying interpretable machine learning in computational biology-pitfalls, recommendations and opportunities for new developments.

Affiliations

Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.

Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.

Publication information

Nat Methods. 2024 Aug;21(8):1454-1461. doi: 10.1038/s41592-024-02359-7. Epub 2024 Aug 9.

DOI:10.1038/s41592-024-02359-7

PMID:39122941

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11348280/

Abstract

Recent advances in machine learning have enabled the development of next-generation predictive models for complex computational biology problems, thereby spurring the use of interpretable machine learning (IML) to unveil biological insights. However, guidelines for using IML in computational biology are generally underdeveloped. We provide an overview of IML methods and evaluation techniques and discuss common pitfalls encountered when applying IML methods to computational biology problems. We also highlight open questions, especially in the era of large language models, and call for collaboration between IML and computational biology researchers.

Similar articles

Advancing Computational Toxicology by Interpretable Machine Learning.

Environ Sci Technol. 2023 Nov 21;57(46):17690-17706. doi: 10.1021/acs.est.3c00653. Epub 2023 May 24.

Interpretable machine learning for genomics.

Hum Genet. 2022 Sep;141(9):1499-1513. doi: 10.1007/s00439-021-02387-9. Epub 2021 Oct 20.

Next-Generation Machine Learning for Biological Networks.

Cell. 2018 Jun 14;173(7):1581-1592. doi: 10.1016/j.cell.2018.05.015. Epub 2018 Jun 7.

Incorporating Machine Learning into Established Bioinformatics Frameworks.

Int J Mol Sci. 2021 Mar 12;22(6):2903. doi: 10.3390/ijms22062903.

Deep learning for computational biology.

Mol Syst Biol. 2016 Jul 29;12(7):878. doi: 10.15252/msb.20156651.

Opening the Black Box: Interpretable Machine Learning for Geneticists.

Trends Genet. 2020 Jun;36(6):442-455. doi: 10.1016/j.tig.2020.03.005. Epub 2020 Apr 17.

Using Drug Expression Profiles and Machine Learning Approach for Drug Repurposing.

Methods Mol Biol. 2019;1903:219-237. doi: 10.1007/978-1-4939-8955-3_13.

Interpretable Machine Learning Techniques in ECG-Based Heart Disease Classification: A Systematic Review.

Diagnostics (Basel). 2022 Dec 29;13(1):111. doi: 10.3390/diagnostics13010111.

Integration of machine learning with computational structural biology of plants.

Biochem J. 2022 Apr 29;479(8):921-928. doi: 10.1042/BCJ20200942.

Cited by

ESM2_AMP: an interpretable framework for protein-protein interactions prediction and biological mechanism discovery.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf434.

Advances in Functional Genomics for Exploring Abiotic Stress Tolerance Mechanisms in Cereals.

Plants (Basel). 2025 Aug 8;14(16):2459. doi: 10.3390/plants14162459.

Visible neural networks for multi-omics integration: a critical review.

Front Artif Intell. 2025 Jul 17;8:1595291. doi: 10.3389/frai.2025.1595291. eCollection 2025.

FR-BINN: Biologically Informed Neural Networks for Enhanced Biomarker Discovery and Pathway Analysis.

Int J Mol Sci. 2025 Jul 11;26(14):6670. doi: 10.3390/ijms26146670.

Predicting fitness in Mycobacterium tuberculosis with transcriptional regulatory network-informed interpretable machine learning.

Front Tuberc. 2025;3. doi: 10.3389/ftubr.2025.1500899. Epub 2025 Apr 2.

Perspective on recent developments and challenges in regulatory and systems genomics.

Bioinform Adv. 2025 May 9;5(1):vbaf106. doi: 10.1093/bioadv/vbaf106. eCollection 2025.

Proteome-wide prediction of the mode of inheritance and molecular mechanisms underlying genetic diseases using structural interactomics.

iScience. 2025 Jun 4;28(7):112812. doi: 10.1016/j.isci.2025.112812. eCollection 2025 Jul 18.

Fine-tuning protein language models to understand the functional impact of missense variants.

Comput Struct Biotechnol J. 2025 May 28;27:2199-2207. doi: 10.1016/j.csbj.2025.05.022. eCollection 2025.

scATD: a high-throughput and interpretable framework for single-cell cancer drug resistance prediction and biomarker identification.

Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf268.

MIMIC: a Python package for simulating, inferring, and predicting microbial community interactions and dynamics.

Bioinformatics. 2025 May 6;41(5). doi: 10.1093/bioinformatics/btaf174.

References

Auditing the inference processes of medical-image classifiers by leveraging generative AI and the expertise of physicians.

Nat Biomed Eng. 2025 Mar;9(3):294-306. doi: 10.1038/s41551-023-01160-9. Epub 2023 Dec 28.

Toward Explainable Artificial Intelligence for Precision Pathology.

Annu Rev Pathol. 2024 Jan 24;19:541-570. doi: 10.1146/annurev-pathmechdis-051222-113147. Epub 2023 Oct 23.

BioAutoMATED: An end-to-end automated machine learning tool for explanation and design of biological sequences.

Cell Syst. 2023 Jun 21;14(6):525-542.e9. doi: 10.1016/j.cels.2023.05.007.

Transfer learning enables predictions in network biology.

Nature. 2023 Jun;618(7965):616-624. doi: 10.1038/s41586-023-06139-9. Epub 2023 May 31.

Explainable multi-task learning for multi-modality biological data analysis.

Nat Commun. 2023 May 3;14(1):2546. doi: 10.1038/s41467-023-37477-x.

PAUSE: principled feature attribution for unsupervised gene expression analysis.

Genome Biol. 2023 Apr 19;24(1):81. doi: 10.1186/s13059-023-02901-4.

Multi-task learning from multimodal single-cell omics with Matilda.

Nucleic Acids Res. 2023 May 8;51(8):e45. doi: 10.1093/nar/gkad157.

Cell-type-specific prediction of 3D chromatin organization enables high-throughput in silico genetic screening.

Nat Biotechnol. 2023 Aug;41(8):1140-1150. doi: 10.1038/s41587-022-01612-8. Epub 2023 Jan 9.

Interpretable deep learning for chromatin-informed inference of transcriptional programs driven by somatic alterations across cancers.

Nucleic Acids Res. 2022 Oct 28;50(19):10869-10881. doi: 10.1093/nar/gkac881.

Obtaining genetics insights from deep learning via explainable artificial intelligence.

Nat Rev Genet. 2023 Feb;24(2):125-137. doi: 10.1038/s41576-022-00532-2. Epub 2022 Oct 3.
