State Key Laboratory of Chemical Resource Engineering, Department of Pharmaceutical Engineering, Beijing University of Chemical Technology, Beijing, P. R. China.
College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China.
SAR QSAR Environ Res. 2021 Feb;32(2):85-110. doi: 10.1080/1062936X.2020.1862297. Epub 2021 Feb 1.
Tyrosinase is a key rate-limiting enzyme in melanin synthesis and is closely related to human pigmentation disorders. Tyrosinase inhibitors can down-regulate tyrosinase and thereby reduce melanin synthesis. In this work, we conducted a structure-activity relationship (SAR) study on 1097 diverse mushroom tyrosinase inhibitors. We applied five machine learning methods to develop 15 classification models. Model 5B, built with fully connected neural networks and ECFP4 fingerprints, achieved the highest prediction accuracy (91.36%) and a Matthews correlation coefficient (MCC) of 0.81 on the test set. The applicability domains (AD) of classification models were defined by method. Moreover, we clustered the 1097 inhibitors into eight subsets by K-Means to identify the inhibitors' structural features. In addition, 10 quantitative structure-activity relationship (QSAR) models were constructed with four machine learning methods based on 813 inhibitors. Model 6J, the best QSAR model, was developed with fully connected neural networks and 50 RDKit descriptors. It achieved a coefficient of determination (R²) of 0.770 and a root mean squared error (RMSE) of 0.482 on the test set. The AD of Model 6J was visualized with a Williams plot. The models built in this study can be obtained from the authors.
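For readers unfamiliar with the evaluation metrics reported above, the following is a minimal sketch of how accuracy, MCC, RMSE, and the coefficient of determination are commonly computed. It is not the authors' code: the function names are illustrative, labels are assumed to be encoded as 1 (active) and 0 (inactive), and the coefficient of determination is assumed to follow the usual 1 - SS_res/SS_tot definition, which may differ from the exact formula used in the study.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 1/0 (assumed encoding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of correctly classified compounds."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted activities."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assuming the 1 - SS_res / SS_tot definition."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

MCC is often reported alongside accuracy because it uses all four cells of the confusion matrix and is therefore less flattering than accuracy when the active and inactive classes are imbalanced.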