Baskin Igor I
Faculty of Physics, M.V. Lomonosov Moscow State University, Moscow, Russian Federation.
Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russian Federation.
Methods Mol Biol. 2018;1800:119-139. doi: 10.1007/978-1-4939-7899-1_5.
This review considers machine learning methods (supervised and unsupervised, linear and nonlinear, classification and regression), in combination with various types of molecular descriptors, both "handcrafted" and "data-driven," in the context of their use in computational toxicology. It focuses on multiple linear regression, variants of the naïve Bayes classifier, k-nearest neighbors, support vector machines, decision trees, ensemble learning, random forest, several types of neural networks, and deep learning. The role of fragment descriptors, graph mining, and graph kernels is highlighted. Unsupervised methods, such as Kohonen's self-organizing maps and related approaches, which allow predictions to be combined with data analysis and visualization, are also considered. The review underlines the need to apply a wide range of machine learning methods in computational toxicology.