School of Life Sciences, Liaoning University, Shenyang, 110036, China.
School of Pharmaceutical Sciences, Liaoning University, Shenyang, 110036, China; Technology Innovation Center for Computer Simulating and Information Processing of Bio-macromolecules of Liaoning Province, Shenyang, 110036, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Liaoning University, Shenyang, 110036, China; Shenyang Key Laboratory of Computer Simulating and Information Processing of Bio-macromolecules, Liaoning University, Shenyang, 110036, China.
Comput Biol Med. 2022 May;144:105390. doi: 10.1016/j.compbiomed.2022.105390. Epub 2022 Mar 11.
Drug toxicity has become a critical problem that imposes heavy medical and economic burdens. Acquired long QT syndrome (acLQTS) is an acquired cardiac ion channel disease caused by drugs blocking the hERG channel. Avoiding cardiotoxicity is therefore necessary in drug design, and computational models have been widely used to address this problem. In this study, we collected a hERG inhibitor dataset containing 8671 compounds and featurized these compounds with traditional molecular fingerprints (Baseline2D, ECFP4, PropertyFP, and 3DFP) and the newly proposed molecular dynamics fingerprint (MDFP). Regression models were then built with four machine learning algorithms, using each of these fingerprints as well as the combined multi-dimensional molecular fingerprint (MultiFP). After cross-validation and validation on an independent test dataset, the best model was the consensus of the four algorithms with MultiFP. This model outperforms recently published methods for hERG cardiotoxicity prediction, with an RMSE of 0.531 and an R of 0.653 on the test dataset. Feature importance analysis and correlation analysis identified novel structural features and molecular dynamics features that are highly associated with the hERG inhibition of compounds. Our findings provide new insight into multi-dimensional molecular fingerprints and consensus models for hERG cardiotoxicity prediction.
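The abstract does not say how the fingerprints are combined into MultiFP or how the four models are merged into a consensus, so the following is only a minimal sketch under stated assumptions: MultiFP is taken to be a plain concatenation of the per-compound feature blocks, the consensus is taken to be the arithmetic mean of the individual model predictions, and R is taken to be the Pearson correlation coefficient. All function names are hypothetical and the numbers in the usage note are toy values, not data from the study.

```python
import math
from statistics import mean


def concat_fingerprints(*blocks):
    """Join per-compound feature blocks row by row (assumed MultiFP layout)."""
    n = len(blocks[0])
    if any(len(block) != n for block in blocks):
        raise ValueError("every fingerprint block needs one row per compound")
    return [[x for block in blocks for x in block[i]] for i in range(n)]


def consensus(predictions):
    """Average several models' predictions compound by compound.

    `predictions` holds one list per model, all in the same compound order.
    Simple averaging is an assumption, not the paper's stated method.
    """
    return [mean(per_compound) for per_compound in zip(*predictions)]


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    mt, mp = mean(y_true), mean(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


if __name__ == "__main__":
    # Toy values only: two compounds, two fingerprint blocks, two models.
    multifp = concat_fingerprints([[1, 0], [0, 1]], [[0.5], [0.7]])
    observed = [5.0, 6.0]
    averaged = consensus([[5.2, 6.1], [4.8, 5.7]])
    print(multifp, averaged, rmse(observed, averaged), pearson_r(observed, averaged))
```

Averaging is the simplest way to form a consensus of regressors. Weighted averaging or stacking would be equally compatible with the abstract's wording, so the choice here is illustrative rather than a reconstruction of the authors' pipeline.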