Ancuceanu Robert, Dinu Mihaela, Neaga Iana, Laszlo Fekete Gyula, Boda Daniel
Department of Pharmaceutical Botany and Cell Biology, Faculty of Pharmacy, 'Carol Davila' University of Medicine and Pharmacy, 020956 Bucharest, Romania.
Department of Public Health and Management, Faculty of Medicine, 'Carol Davila' University of Medicine and Pharmacy, 050463 Bucharest, Romania.
Oncol Lett. 2019 May;17(5):4188-4196. doi: 10.3892/ol.2019.10068. Epub 2019 Feb 25.
SK-MEL-5 is a human melanoma cell line that has been used in various studies to explore new therapies against melanoma. We report the development of quantitative structure-activity relationship (QSAR) models able to predict the cytotoxic effect of diverse chemical compounds on this cancer cell line. The dataset of cytotoxic and inactive compounds was downloaded from the PubChem database; it contains data for all chemical compounds for which cytotoxicity results expressed as growth inhibition (GI) were recorded. In total, 13 blocks of molecular descriptors were computed and used, after appropriate pre-processing, to build QSAR models with four machine learning classifiers: random forest (RF), gradient boosting, support vector machine and random k-nearest neighbors. Among the 186 models reported, none had a positive predictive value (PPV) higher than 0.90 in both nested cross-validation and external dataset testing, but seven models had a PPV higher than 0.85 in both evaluations. All seven used the RF algorithm as the classifier, with topological descriptors, information indices, 2D-autocorrelation descriptors, P-VSA-like descriptors and edge-adjacency descriptors as the feature sets used for classification. Under the y-scrambling test, performance was considerably worse (confirming the non-random character of the models), and the applicability domain was assessed through three different methods.
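The selection criterion described above (PPV above a threshold in both nested cross-validation and external testing) and the label permutation behind a y-scrambling test can be illustrated with a minimal Python sketch. The function names, the 0/1 label encoding (1 = cytotoxic) and the example labels are assumptions made for illustration; they are not taken from the study, and the authors' actual pipeline is not shown here.

```python
import random
from typing import List, Sequence


def ppv(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Positive predictive value, TP / (TP + FP), with 1 = cytotoxic (assumed encoding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)
    if tp + fp == 0:
        return float("nan")  # no positive predictions, so PPV is undefined
    return tp / (tp + fp)


def passes_both(ppv_nested_cv: float, ppv_external: float,
                threshold: float = 0.85) -> bool:
    """True only if PPV exceeds the threshold in both evaluations."""
    return ppv_nested_cv > threshold and ppv_external > threshold


def scramble_labels(y: Sequence[int], seed: int = 0) -> List[int]:
    """Return a randomly permuted copy of the labels (the core of y-scrambling).

    A model refitted on scrambled labels should perform much worse than the
    model fitted on the real labels if the latter is not a chance result.
    """
    shuffled = list(y)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Hypothetical labels, invented for illustration (not data from the study).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
example_ppv = ppv(y_true, y_pred)  # 4 true positives, 1 false positive
```

PPV is the natural metric when the goal is to prioritize compounds for testing, because it measures how many of the compounds predicted cytotoxic actually are. Requiring the threshold in both evaluations guards against a model that looks good on only one of them.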