Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, China.
J Appl Toxicol. 2023 Oct;43(10):1462-1475. doi: 10.1002/jat.4477. Epub 2023 May 6.
The human ether-à-go-go-related gene (hERG) is associated with drug cardiotoxicity. Blockade of the hERG channel prolongs the QT interval and, in severe cases, causes sudden death. It is therefore important to evaluate the hERG-blocking potential of compounds early in drug discovery. In this study, a dataset of 4556 compounds with IC50 values determined by patch clamp techniques on mammalian cell lines was collected, and hERG blockers and non-blockers were distinguished according to three single thresholds and two binary thresholds. Four machine learning (ML) algorithms, each combined with four molecular fingerprints and molecular descriptors, together with graph convolutional neural networks (GCNs), were used to construct a series of binary classification models. The best model varied with the threshold. The ML models built with support vector machine and random forest performed well on Morgan fingerprints and molecular descriptors, with AUCs ranging from 0.884 to 0.950. GCN showed superior prediction performance, with AUCs above 0.952, which might be related to its direct extraction of molecular features from the original input. Classification under binary thresholds was also better than under single thresholds, which could provide a more accurate prediction of hERG blockers. Finally, the applicability domain of the model was defined, and seven structural alerts that might cause hERG blockage were identified by information gain and substructure frequency analysis. This work should be beneficial for identifying hERG blockers among chemicals.
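The difference between the two labelling schemes can be illustrated with a minimal Python sketch. The abstract does not give the threshold values or the exact labelling rule, so everything here is an assumption for illustration: the function names `label_single` and `label_binary` are hypothetical, the cutoffs of 1 and 10 (in µM) are placeholders, and a "binary threshold" is read as a pair of cutoffs where compounds falling between them are left unlabelled.

```python
from typing import List, Optional


def label_single(ic50_um: float, threshold_um: float) -> int:
    """Single threshold: 1 (blocker) if IC50 <= threshold, else 0 (non-blocker)."""
    return 1 if ic50_um <= threshold_um else 0


def label_binary(
    ic50_um: float, blocker_max_um: float, nonblocker_min_um: float
) -> Optional[int]:
    """Binary threshold (assumed reading): two cutoffs with a gap between them.

    Returns 1 for IC50 <= blocker_max_um, 0 for IC50 >= nonblocker_min_um,
    and None for compounds in the gap, which are dropped from the dataset.
    """
    if blocker_max_um > nonblocker_min_um:
        raise ValueError("blocker_max_um must not exceed nonblocker_min_um")
    if ic50_um <= blocker_max_um:
        return 1
    if ic50_um >= nonblocker_min_um:
        return 0
    return None


# Hypothetical IC50 values in micromolar; not taken from the paper's dataset.
ic50_values: List[float] = [0.3, 1.0, 4.2, 10.0, 35.0]

# Placeholder cutoffs (assumptions): 10 uM single; 1 uM / 10 uM binary.
single_labels = [label_single(v, 10.0) for v in ic50_values]
binary_labels = [label_binary(v, 1.0, 10.0) for v in ic50_values]

print(single_labels)
print(binary_labels)
```

Under this reading, the binary scheme discards borderline compounds, so the two classes are more clearly separated. That would be one plausible reason for the better classification reported under binary thresholds, although the abstract itself does not state the cause.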