Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.
Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.
Nat Methods. 2024 Aug;21(8):1454-1461. doi: 10.1038/s41592-024-02359-7. Epub 2024 Aug 9.
Recent advances in machine learning have enabled the development of next-generation predictive models for complex computational biology problems, thereby spurring the use of interpretable machine learning (IML) to unveil biological insights. However, guidelines for using IML in computational biology are generally underdeveloped. We provide an overview of IML methods and evaluation techniques and discuss common pitfalls encountered when applying IML methods to computational biology problems. We also highlight open questions, especially in the era of large language models, and call for collaboration between IML and computational biology researchers.