Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand.
Department of Clinical Microbiology and Applied Technology, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand.
J Comput Chem. 2020 Jul 30;41(20):1820-1834. doi: 10.1002/jcc.26223. Epub 2020 May 25.
Hepatitis C virus (HCV) is a major cause of liver disease, affecting an estimated 170 million people and resulting in 300,000 deaths from cirrhosis or liver cancer. NS5B, which is central to viral replication, is one of three potential therapeutic targets against HCV (the other two being NS3/4A and NS5A). In this study, we developed a classification structure-activity relationship (CSAR) model for identifying substructures that give rise to anti-HCV activity among a set of 578 non-redundant compounds. NS5B inhibitors were described by a set of 12 fingerprint descriptors, and predictive models were constructed from 100 independent data splits using the random forest algorithm. The modelability index (MODI) of the data set was 0.88, exceeding the established threshold of 0.65, which indicates that the data set is suitable for modeling. Predictive performance was assessed by accuracy, sensitivity, specificity, and the Matthews correlation coefficient and was found to be statistically robust (the first three metrics exceeded 0.8, while the Matthews correlation coefficient exceeded 0.7). An in-depth analysis of the top 20 most important descriptors revealed that aromatic rings and alkyl side chains are important for NS5B inhibition. Finally, the predictive model was deployed as the publicly accessible HCVpred web server (available at http://codes.bio/hcvpred/), which allows users to predict whether a compound is active or inactive against HCV NS5B. The knowledge and web server presented herein can thus be used in the design of more potent and specific drugs against HCV NS5B.
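The four performance metrics named in the abstract can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' code; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
from math import sqrt


def classification_metrics(tp, tn, fp, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn: active compounds predicted active/inactive.
    tn/fp: inactive compounds predicted inactive/active.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Sensitivity: fraction of true actives that were predicted active.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true inactives that were predicted inactive.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; conventionally 0 when the denominator is 0.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = classification_metrics(tp=45, tn=40, fp=10, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix and ranges from -1 to 1, which makes it a useful complement when active and inactive classes are imbalanced.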