Wilkes Jon G, Stoyanova-Slavova Iva B, Buzatu Dan A
Division of Systems Biology, National Center for Toxicological Research, 3900 NCTR Road, Jefferson, AR, 72079, USA.
J Comput Aided Mol Des. 2016 Apr;30(4):331-45. doi: 10.1007/s10822-016-9909-0. Epub 2016 Mar 30.
Molecular biochemistry is controlled by 3D phenomena, but structure-activity models based on 3D descriptors are infrequently used for large data sets because of the computational overhead of determining molecular conformations. A diverse data set of 146 androgen receptor binders was used to investigate how different methods for defining molecular conformations affect the performance of 3D quantitative spectral data-activity relationship (3D-QSDAR) models. The molecular conformations tested were: (1) the global minimum of each molecule's potential energy surface; (2) alignment to templates using equal electronic and steric force field contributions; (3) alignment using "Best-for-Each" template contributions; (4) non-energy-optimized, non-aligned (2D > 3D). Aggregate predictions from the models were compared. The highest average test-set coefficients of determination ranged from R² = 0.56 to 0.61. The best model using 2D > 3D conformations (imported directly from ChemSpider) produced a test-set R² of 0.61. It was superior to the energy-minimized and conformation-aligned models and was achieved in only 3-7% of the time required by the other conformation strategies. Predictions averaged from models built on different conformations achieved a consensus test-set R² of 0.65. The best 2D > 3D model was analyzed for underlying structure-activity relationships. For the compound that binds most strongly to the androgen receptor, 10 substructural features contributing to binding were flagged. The utility of 2D > 3D was also compared for two other activity endpoints, each modeled with a medium-sized data set. The results suggest that large-scale, accurate predictions using 2D > 3D SDAR descriptors may be produced for interactions involving endocrine system nuclear receptors and for other data sets in which the strongest activities are produced by fairly inflexible substrates.