Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, 75 Commercial Road, Melbourne 3004, VIC, Australia.
Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, 30 Flemington Rd, Parkville 3052, VIC, Australia.
J Chem Inf Model. 2020 Jul 27;60(7):3450-3456. doi: 10.1021/acs.jcim.0c00362. Epub 2020 Jul 16.
Development of new potent, safe drugs to treat mycobacterial infections has proven challenging, with low hit rates in initial screens restricting subsequent development efforts. Despite significant efforts and the evolution of quantitative structure-activity relationship (QSAR) and machine learning-based models for computationally predicting molecule bioactivity, there is an unmet need for efficient and reliable methods for identifying compounds that are biologically active against Mycobacterium and also safe for humans. Here we developed mycoCSM, a graph-based signature approach to rapidly identify compounds likely to be active against bacteria from the genus Mycobacterium, or against specific Mycobacterium species. mycoCSM was trained and validated on eight organism-specific data sets and, for the first time, a general Mycobacterium data set, achieving correlation coefficients of up to 0.89 on cross-validation and 0.88 on independent blind tests when predicting bioactivity in terms of minimum inhibitory concentration. In addition, we developed a predictor to identify compounds likely to penetrate necrotic tuberculosis foci, which achieved a correlation coefficient of 0.75. Together with a built-in estimator of the maximum tolerated dose in humans, we believe this method will provide a valuable resource to enrich screening libraries with potent, safe molecules. To provide simple guidance in the selection of libraries with favorable anti-Mycobacterium properties, we have made mycoCSM freely available online at http://biosig.unimelb.edu.au/myco_csm.
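The performance figures above are correlation coefficients between predicted and experimentally measured bioactivity. As a minimal sketch of how such a figure could be computed, the following assumes a Pearson correlation (the abstract does not state which coefficient was used). The function name `pearson_r` and all numeric values are hypothetical and for illustration only; they are not data from the study.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between two equal-length numeric sequences.

    Assumption: the coefficient type is not named in the abstract;
    Pearson is used here purely as an illustration.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sum of cross-deviations and the two sums of squared deviations.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


# Hypothetical activity values on an arbitrary log scale (made up).
observed = [4.2, 5.1, 6.3, 5.8, 4.9]
predicted = [4.0, 5.4, 6.0, 5.5, 5.2]
r = pearson_r(observed, predicted)
print(f"r = {r:.2f}")
```

In a cross-validation setting, `predicted` would hold the out-of-fold predictions for each compound; on a blind test, it would hold predictions for compounds never seen during training.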