Department of Pharmaceutical Sciences, College of Pharmacy, University of Tennessee Health Science Center, Memphis, TN, USA.
Pharmaceutical Sciences Department (College of Pharmacy), Rosalind Franklin University of Medicine and Science, North Chicago, IL, USA.
Mol Divers. 2024 Apr;28(2):497-507. doi: 10.1007/s11030-022-10596-1. Epub 2023 Jan 17.
Fingerprint-based similarity searching is an important strategy for virtual screening in drug discovery. In the present study, we carried out a systematic virtual screening study and then built kernel-based partial least squares (KPLS) prediction models for five tyrosine kinase drug targets: C-terminal SRC kinase (CSK), human epidermal growth factor receptor 2 (HER2), and Janus kinases 1, 2, and 3 (JAK1, JAK2, and JAK3), using a dataset of 3688 compounds. These kinases are important drug discovery targets: HER2 has been validated for the treatment of metastatic breast cancer, JAK inhibitors have been validated for the clinical management of arthritis and autoimmune diseases, and CSK has been found to play an important role in bone remodeling in arthritis. For each target, we ran similarity screenings using the most active molecule in the dataset as the query and eight types of two-dimensional (2D) molecular fingerprints: seven hashed fingerprints (Linear, Dendritic, Radial, Pairwise, Triplet, Torsion, and MOLPRINT2D) and one structural-keys fingerprint (MACCS). The top-ranked 1% of compounds from each target's similarity screening results were used to build the KPLS prediction models, with q² values up to 0.8. The best KPLS model for each target was selected based on its predictive ability and bootstrapping results and was then used for prediction. This integrated approach, which combines similarity screening with KPLS analysis, has high potential to enhance the accuracy and efficiency of virtual screening and thus improve the drug discovery process.
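The similarity-screening step described above (rank a library against one query fingerprint, keep the top-ranked fraction) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the Tanimoto coefficient as the similarity metric, which the abstract does not specify, and it represents each fingerprint as a set of on-bit indices. The names `tanimoto` and `screen` and the toy compounds are hypothetical.

```python
import math


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices.

    Assumption: the abstract does not name its similarity metric; Tanimoto
    is used here only because it is a common choice for 2D fingerprints.
    """
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def screen(query_fp: frozenset, library: dict, fraction: float = 0.01) -> list:
    """Rank library compounds by similarity to the query; keep the top fraction.

    Ties are broken by compound name so the ranking is deterministic.
    At least one compound is always returned.
    """
    scored = sorted(
        ((tanimoto(query_fp, fp), name) for name, fp in library.items()),
        key=lambda pair: (-pair[0], pair[1]),
    )
    keep = max(1, math.ceil(fraction * len(scored)))
    return [(name, score) for score, name in scored[:keep]]


# Toy data (hypothetical): four compounds, each a set of on-bit indices.
query = frozenset({1, 4, 7, 9})
library = {
    "cmpd_A": frozenset({1, 4, 7, 9}),
    "cmpd_B": frozenset({1, 4, 7}),
    "cmpd_C": frozenset({2, 3}),
    "cmpd_D": frozenset({1, 9, 12, 15}),
}

# Keep the top half here so the toy example returns more than one hit;
# the study's cutoff corresponds to fraction=0.01.
hits = screen(query, library, fraction=0.5)
print(hits)
```

In a real workflow the bit sets would come from a cheminformatics toolkit's fingerprint generator, and the retained compounds would then feed the regression-modeling step.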