Bonanni Davide, Pinzi Luca, Rastelli Giulio
Department of Life Sciences, University of Modena and Reggio Emilia, Via Campi 103, 41125, Modena, Italy.
J Cheminform. 2022 Nov 8;14(1):77. doi: 10.1186/s13321-022-00647-y.
Prostate cancer is the most common type of cancer in men. Survival rates are good when the disease is treated at an early stage, but its most aggressive variant still lacks effective therapies, so novel therapeutics are urgently needed. On these premises, we developed a series of machine learning (ML) models that predict the activity of ligands towards the PC-3 and DU-145 prostate cancer cell lines, trained on compounds with highly homogeneous reported cell-based antiproliferative assay data. Model development was tuned across a series of activity thresholds for classifying compounds as active or inactive, the number of features used, and 10 different ML algorithms. Model evaluation allowed us to identify the best combination of activity thresholds and ML algorithms for the classification of active compounds, achieving Matthews correlation coefficient (MCC) values above 0.60 for PC-3 and DU-145 cells. Moreover, in silico models based on the combination of PC-3 and DU-145 data were also developed and showed excellent precision. Finally, an analysis of the activity annotations reported for the ligands in the curated datasets was conducted, suggesting associations between cellular activity and biological targets that might be explored in the future for the design of more effective prostate cancer antiproliferative agents.
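As a minimal sketch of the two ideas the abstract relies on, the snippet below labels compounds as active or inactive with an activity threshold and scores a set of predictions with the MCC. The potency values, the 10 µM cutoff, the use of IC50 as the activity measure, and the helper names `label_actives` and `mcc` are illustrative assumptions, not the datasets, thresholds, or code used in the study.

```python
import math


def label_actives(ic50_um, threshold_um):
    """Label a compound active (1) if its IC50 is at or below the threshold, else inactive (0)."""
    return [1 if value <= threshold_um else 0 for value in ic50_um]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (returns 0.0 if undefined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical IC50 values in micromolar (assumed for illustration only).
ic50 = [0.5, 2.0, 8.0, 15.0, 40.0, 120.0]
y_true = label_actives(ic50, threshold_um=10.0)

# Hypothetical classifier output for the same six compounds.
y_pred = [1, 1, 0, 0, 0, 1]

score = mcc(y_true, y_pred)
print(f"MCC = {score:.2f}")
```

Changing `threshold_um` shifts the balance between active and inactive labels, which is why a metric such as the MCC, which accounts for all four cells of the confusion matrix, is a reasonable way to compare models built at different thresholds.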