Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, 3900 NCTR Rd., Jefferson, Arkansas 72079, United States.
J Chem Inf Model. 2012 Jul 23;52(7):1854-64. doi: 10.1021/ci3001698. Epub 2012 Jun 22.
An improved three-dimensional quantitative spectral data-activity relationship (3D-QSDAR) methodology was used to build and validate models relating the activity of 130 estrogen receptor binders to specific structural features. In 3D-QSDAR, each compound is represented by a unique fingerprint constructed from ¹³C chemical shift pairs and their associated interatomic distances. Grids of different granularity can be used to partition the abstract fingerprint space into congruent "bins", but the optimal bin size had not previously been explored. To determine it, the Endocrine Disruptor Knowledge Base data were used to generate 50 3D-QSDAR models with bins ranging in size from 2 ppm × 2 ppm × 0.5 Å to 20 ppm × 20 ppm × 2.5 Å, each of which was validated using 100 training/test set partitions. The best average predictivity, measured as test-set R², was achieved at 10 ppm × 10 ppm × Z Å (Z = 0.5, ..., 2.5 Å). It was hypothesized that this optimum depends on the estimation error of the chemical shifts (±4.13 ppm) and the precision of the calculated interatomic distances. The highest-ranked bins according to partial least-squares weights were found to be associated with structural features known to be essential for binding to the estrogen receptor.
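The binning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each carbon atom has one estimated ¹³C shift and one 3D coordinate, that every unordered carbon pair contributes one point (lower shift, higher shift, distance), and that a bin's value is simply the number of points falling in it. The function name `qsdar_fingerprint` and its parameters are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def qsdar_fingerprint(shifts, coords, bin_ppm=10.0, bin_dist=0.5):
    """Count carbon-carbon pairs per (shift, shift, distance) bin.

    shifts   -- 13C chemical shifts in ppm, one per carbon atom
    coords   -- (x, y, z) coordinates in angstroms, same order as shifts
    bin_ppm  -- bin edge length along both shift axes (ppm)
    bin_dist -- bin edge length along the distance axis (angstroms)
    """
    counts = Counter()
    for i, j in combinations(range(len(shifts)), 2):
        # Assumed convention: order the pair so (a, b) and (b, a) share a bin.
        lo, hi = sorted((shifts[i], shifts[j]))
        dist = math.dist(coords[i], coords[j])
        # Floor division maps each coordinate to the index of its grid cell.
        key = (int(lo // bin_ppm), int(hi // bin_ppm), int(dist // bin_dist))
        counts[key] += 1
    return counts


# Made-up shifts and coordinates for three carbons, for illustration only.
example = qsdar_fingerprint(
    shifts=[155.0, 115.0, 128.0],
    coords=[(0.0, 0.0, 0.0), (1.4, 0.0, 0.0), (2.8, 0.0, 0.0)],
)
```

Calling the function again with other values of `bin_ppm` and `bin_dist` gives the coarser or finer grids whose predictivity the study compares. With wider bins, more pairs share a bin, so the fingerprint becomes less sensitive to errors in the estimated shifts but also less specific.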