Department of Computer Science and Information Engineering, National Taiwan University, No.1 Sec.4, Roosevelt Road, Taipei, Taiwan.
J Chem Inf Model. 2010 Jul 26;50(7):1304-18. doi: 10.1021/ci100081j.
Blockage of the human ether-a-go-go related gene (hERG) potassium ion channel is a major factor in cardiotoxicity. Hence, binding of drugs to this channel has become an important biological end point in side-effect screening. A set of 250 structurally diverse compounds screened for hERG activity was assembled from the literature using a set of reliability filters. This data set was used to construct a set of two-state hERG QSAR models. The descriptor pool used to construct the models consisted of 4D-fingerprints generated from the thermodynamic distribution of conformer states available to a molecule, 204 traditional 2D descriptors, and 76 3D VolSurf-like descriptors computed using the Molecular Operating Environment (MOE) software. One model is a continuous partial least-squares (PLS) QSAR hERG binding model. A related model is an optimized binary classification QSAR model that classifies compounds as active or inactive. This binary model achieves 91% accuracy over the large range of molecular diversity spanned by the training set. Two external test sets were constructed. One is the condensed PubChem bioassay database containing 876 compounds, and the other consists of 106 additional compounds found in the literature. Both test sets were used to validate the binary QSAR model. The binary QSAR model permits a structural interpretation of possible sources of hERG activity. In particular, the presence of a polar negative group at a distance of 6-8 Å from a hydrogen bond donor in a compound is predicted to be a rather structure-specific pharmacophore that increases hERG blockage. Since a data set of high chemical diversity was used to construct the binary model, it is applicable to general virtual hERG screening.
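The 6-8 Å distance criterion mentioned above can be illustrated with a minimal sketch. It is not the authors' 4D-fingerprint procedure: the function name `fraction_in_range`, the coordinates, and the equal weighting of conformers are all assumptions made for illustration, whereas the abstract describes a thermodynamic distribution of conformer states. The sketch simply reports the fraction of conformers in which a polar negative group and a hydrogen bond donor fall within the stated distance window.

```python
import math

def fraction_in_range(conformers, lo=6.0, hi=8.0):
    """Return the fraction of conformers whose feature-pair distance
    lies within [lo, hi] angstroms (inclusive on both ends).

    Each conformer is a pair of 3D points:
    (polar negative group position, hydrogen bond donor position).
    Conformers are weighted equally here, which is a simplification.
    """
    if not conformers:
        raise ValueError("need at least one conformer")
    hits = sum(1 for neg, donor in conformers
               if lo <= math.dist(neg, donor) <= hi)
    return hits / len(conformers)

# Hypothetical coordinates in angstroms, chosen only for illustration.
conformers = [
    ((0.0, 0.0, 0.0), (7.0, 0.0, 0.0)),  # distance 7.0, inside the window
    ((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)),  # distance 5.0, outside
    ((1.0, 1.0, 1.0), (1.0, 7.0, 1.0)),  # distance 6.0, on the lower bound
    ((0.0, 0.0, 0.0), (0.0, 0.0, 9.5)),  # distance 9.5, outside
]

print(fraction_in_range(conformers))  # 0.5
```

A Boltzmann-weighted variant would replace the simple count with a sum of per-conformer weights, provided conformer energies are available.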