Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China.
CarbonSilicon AI Technology Co., Ltd, Hangzhou, Zhejiang 310018, China.
Brief Bioinform. 2023 Mar 19;24(2). doi: 10.1093/bib/bbad014.
Identifying potential targets for known bioactive compounds and novel synthetic analogs is of considerable significance. In silico target fishing (TF) has become an alternative to expensive and laborious wet-lab experiments, driven by the explosive growth of bioactivity data and the rapid development of high-throughput technologies. However, TF methods rely on different algorithms, molecular representations and training datasets, so they may return different results for the same query molecule, which can confuse practitioners. This study therefore systematically evaluated nine popular ligand-based TF methods using target-level and ligand-target pair-level statistical strategies, to help practitioners choose among them. The evaluation showed that SwissTargetPrediction produced the most reliable predictions while enriching more targets. The similarity ensemble approach (SEA), which has high recall, found real targets for more compounds than the other TF methods. SwissTargetPrediction and SEA can therefore be considered first-choice methods in future studies. In addition, the results showed that k = 5 was the optimal number of candidate targets for experimental validation. Finally, a novel ensemble TF method based on consensus voting is proposed to improve prediction performance. Its precision exceeds that of the individual TF methods, indicating that the ensemble identifies real targets more effectively within a given top-k threshold. These results can serve as a reference for practitioners selecting the most effective methods in computational drug discovery.
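To make the consensus-voting idea and the top-k precision measure concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the ensemble procedure used in the study: the abstract does not specify the voting rule, so the sketch assumes each method casts one vote per target in its own top-k list and that ties are broken by mean rank. The method names (`method_a` and so on) and target identifiers are hypothetical placeholders.

```python
from collections import Counter


def consensus_vote(predictions, k=5):
    """Merge ranked target lists from several TF methods by simple voting.

    predictions: dict mapping a method name to its ranked list of target IDs
    (best first). Assumed rule: one vote per method for each target in that
    method's top-k; ties are broken by lower mean rank, then by target ID.
    """
    votes = Counter()
    rank_sum = Counter()
    for ranked in predictions.values():
        for rank, target in enumerate(ranked[:k], start=1):
            votes[target] += 1
            rank_sum[target] += rank
    ordered = sorted(
        votes, key=lambda t: (-votes[t], rank_sum[t] / votes[t], t)
    )
    return ordered[:k]


def precision_at_k(predicted, true_targets, k=5):
    """Fraction of the top-k predicted targets that are known true targets.

    Assumption: the denominator is the number of predictions actually
    returned (at most k), so short lists are not penalised.
    """
    top = predicted[:k]
    if not top:
        return 0.0
    return sum(t in true_targets for t in top) / len(top)


# Hypothetical ranked outputs from three methods for one query compound.
predictions = {
    "method_a": ["T1", "T2", "T3"],
    "method_b": ["T2", "T4", "T1"],
    "method_c": ["T2", "T5", "T6"],
}
ensemble = consensus_vote(predictions, k=3)  # ["T2", "T1", "T4"]
score = precision_at_k(ensemble, {"T2", "T4"}, k=3)  # 2 of 3 are true targets
```

In this toy case the target named by all three methods ranks first, which is the intuition behind expecting a voting ensemble to reach higher precision than any single method within the same top-k cutoff.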