Department of Computer Science, University College London, London, UK.
Nat Rev Mol Cell Biol. 2022 Jan;23(1):40-55. doi: 10.1038/s41580-021-00407-0. Epub 2021 Sep 13.
The expanding scale and inherent complexity of biological data have encouraged a growing use of machine learning in biology to build informative and predictive models of the underlying biological processes. All machine learning techniques fit models to data; however, the specific methods are quite varied and can at first glance seem bewildering. In this Review, we aim to provide readers with a gentle introduction to a few key machine learning techniques, including the most recently developed and widely used techniques involving deep neural networks. We describe how different techniques may be suited to specific types of biological data, and also discuss some best practices and points to consider when one is embarking on experiments involving machine learning. Some emerging directions in machine learning methodology are also discussed.