Yu Xiang-Tian, Wang Lu, Zeng Tao
Key Laboratory of Systems Biology, Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China.
Methods Mol Biol. 2018;1754:183-204. doi: 10.1007/978-1-4939-7717-8_11.
Machine learning broadly comprises in silico methods that turn the principles underlying natural phenomena into information humans can understand, with the aims of saving human labor, assisting human judgment, and creating new knowledge. It has wide application potential in biological and biomedical studies, especially in the era of big biological data. To survey how machine learning has been applied as biology has developed, this review presents a broad range of cases that illustrate how machine learning methods are selected in different practical scenarios across the whole biological and biomedical study cycle, and it further discusses machine learning strategies for analyzing omics data in some cutting-edge biological studies. Finally, it summarizes the new challenges that small-sample, high-dimensional data pose for machine learning, focusing on three key points: sample imbalance, white-box models, and causality.