Darsaraee M, Kaveh S, Mani-Varnosfaderani A, Neiband M S
Chemometrics and Cheminformatics Laboratory, Department of Analytical Chemistry, Tarbiat Modares University, Tehran, Iran.
Department of Chemistry, Payame Noor University (PNU), Tehran, Iran.
J Biomol Struct Dyn. 2024 Oct;42(17):8781-8799. doi: 10.1080/07391102.2023.2248255. Epub 2023 Aug 20.
CC chemokine receptors (CCRs) form a crucial subfamily of G protein-linked receptors that play a distinct role in the onset and progression of various life-threatening diseases. The main aim of this research is to derive general structure-activity relationship (SAR) patterns that describe the selectivity and activity of CCR inhibitors. To this end, a total of 7332 molecules related to the inhibition of CCR1, CCR2, CCR4, and CCR5 were collected from the Binding Database and analyzed using machine learning techniques. A diverse set of 450 molecular descriptors was calculated for each molecule, and the molecules were classified based on their therapeutic targets and activities. The variable importance in the projection (VIP) approach was used to select discriminatory molecular features, and classification models were developed using supervised Kohonen networks (SKN) and counter-propagation artificial neural networks (CPANN). The reliability and predictability of the models were estimated using 10-fold cross-validation, an external validation set, and an applicability domain approach. We were able to identify different sets of molecular descriptors for discriminating between active and inactive molecules and to model the selectivity of inhibitors towards different CCRs. The sensitivities of the predictions for the external test set for the SKN models ranged from 0.827 to 0.873. Finally, the developed classification models were used to screen approximately 2 million random molecules from the PubChem database, with average values for areas under the receiver operating characteristic curves ranging from 0.78 to 0.96 for SKN models and from 0.75 to 0.89 for CPANN models. Communicated by Ramaswamy H. Sarma.
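The two performance figures quoted in the abstract, sensitivity and area under the receiver operating characteristic curve (AUC), can be illustrated with a minimal sketch. This is not the authors' code: the function names `sensitivity` and `roc_auc`, the 1 = active / 0 = inactive label convention, and the toy labels and scores in the usage note are all assumptions made for illustration.

```python
from typing import Sequence


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True-positive rate TP / (TP + FN), assuming 1 = active and 0 = inactive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("y_true contains no active (positive) samples")
    return tp / (tp + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (active, inactive) pairs ranked correctly.

    A pair counts 1 if the active molecule has the higher score and 0.5 on a tie.
    This pairwise form is quadratic in sample count, so it suits small examples only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

As a usage illustration with made-up data, labels `[1, 1, 1, 0, 0, 0]` and scores `[0.9, 0.8, 0.4, 0.7, 0.3, 0.2]` thresholded at 0.5 give predictions `[1, 1, 0, 1, 0, 0]`, so two of the three actives are recovered (sensitivity 2/3), and eight of the nine active/inactive pairs are ordered correctly (AUC 8/9).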