Mol Pharm. 2019 Jun 3;16(6):2605-2615. doi: 10.1021/acs.molpharmaceut.9b00182. Epub 2019 May 3.
Designing compounds that are highly selective for protein subtypes and developing allosteric modulators that target them are critical considerations for both drug discovery and mechanism studies of cannabinoid receptors. Classifiers that can identify active ligands among inactive or random compounds, and distinguish allosteric modulators from orthosteric ligands, are in demand but challenging to build. In this study, supervised machine learning classifiers were built for two subtypes of cannabinoid receptors, CB1 and CB2. Three types of features (molecular descriptors, MACCS fingerprints, and ECFP6 fingerprints) were calculated to evaluate the compound sets from diverse aspects. Deep neural networks, as well as conventional machine learning algorithms including support vector machine, naïve Bayes, logistic regression, and ensemble learning, were applied. Their classification performance with the different feature types was compared and discussed. The advantages and drawbacks of each algorithm were investigated using receiver operating characteristic curves and the calculated metrics. Feature ranking was then performed to help extract useful knowledge about critical molecular properties, substructural keys, and circular fingerprints. The extracted features can facilitate research on cannabinoid receptors by providing guidance on preferred properties for compound modification and novel scaffold design. Machine-learning-based decision-making models provide an alternative to conventional molecular docking studies for compound virtual screening. This study can be of value to the application of machine learning in drug discovery and compound development.
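As a rough illustration of the workflow summarized in the abstract (binary fingerprint features, a naïve Bayes classifier, and evaluation by the area under the receiver operating characteristic curve), the following is a minimal sketch in plain Python. It is not the authors' code. The 6-bit "fingerprints", the labels, and all function names are hypothetical stand-ins; real MACCS or ECFP6 fingerprints would be generated with a cheminformatics toolkit and would be far longer.

```python
import math


def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on binary fingerprint vectors."""
    n_bits = len(X[0])
    model = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        # Laplace-smoothed probability that each bit is set within this class.
        p_bit = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_bits)
        ]
        model[label] = (math.log(len(rows) / len(X)), p_bit)
    return model


def log_odds_active(model, x):
    """Log-odds that fingerprint x belongs to the active class (label 1)."""
    def joint(label):
        log_prior, p_bit = model[label]
        return log_prior + sum(
            math.log(p if bit else 1.0 - p) for bit, p in zip(x, p_bit)
        )
    return joint(1) - joint(0)


def roc_auc(scores, labels):
    """ROC AUC as the fraction of (active, inactive) pairs ranked correctly.

    Ties between an active and an inactive score count as half a win.
    """
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = active ligand, 0 = inactive or random compound.
train_X = [
    [1, 1, 0, 0, 0, 1],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
]
train_y = [1, 1, 1, 0, 0, 0]
test_X = [
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 1],
]
test_y = [1, 1, 0, 0]

model = train_bernoulli_nb(train_X, train_y)
scores = [log_odds_active(model, x) for x in test_X]
auc = roc_auc(scores, test_y)
print("toy ROC AUC:", auc)
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve and keeps the sketch free of external dependencies; on real data sets a library implementation would be the more practical choice.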