Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA.
Quantum Theory Project, The University of Florida, 2328 New Physics Building, Gainesville, Florida 32611-8435, USA.
J Chem Phys. 2017 Oct 28;147(16):161727. doi: 10.1063/1.5001028.
Accurate potential energy models are necessary for reliable atomistic simulations of chemical phenomena. In biomolecular modeling, large systems such as proteins contain a great many noncovalent interactions (NCIs) that can contribute to the protein's stability and structure. This work presents two high-quality chemical databases of common fragment interactions in biomolecular systems, extracted from high-resolution Protein Data Bank crystal structures: 3380 sidechain-sidechain interactions and 100 backbone-backbone interactions, which together inaugurate the BioFragment Database (BFDb). Absolute interaction energies are generated with a computationally tractable explicitly correlated coupled cluster with perturbative triples [CCSD(T)-F12] "silver standard" (0.05 kcal/mol average error) for NCIs that demands only a fraction of the cost of the conventional "gold standard," CCSD(T) at the complete basis set limit. By sampling extensively from biological environments, BFDb spans the natural diversity of protein NCI motifs and orientations. In addition to supplying a thorough assessment of lower-scaling force-field (2), semi-empirical (3), density functional (244), and wavefunction (45) methods (comprising more than one million interaction energies), BFDb provides interactive tools for running and manipulating the resulting large datasets and offers a valuable resource for potential energy model development and validation.
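The counts quoted in the abstract can be checked with a short calculation. The sketch below is only one plausible reading of the "more than one million" figure: it assumes that every assessed method was applied to every complex in both databases, which the abstract does not state explicitly. The variable names are illustrative and are not part of BFDb.

```python
# Counts quoted in the abstract.
complexes = {"sidechain-sidechain": 3380, "backbone-backbone": 100}
methods = {
    "force-field": 2,
    "semi-empirical": 3,
    "density functional": 244,
    "wavefunction": 45,
}

n_complexes = sum(complexes.values())
n_methods = sum(methods.values())

# Assumption (not stated in the abstract): each method is evaluated
# on each complex exactly once.
n_energies = n_complexes * n_methods

print(f"complexes: {n_complexes}")
print(f"methods:   {n_methods}")
print(f"energies:  {n_energies}")
```

Under that assumption the product of the two totals exceeds one million, consistent with the figure given in the abstract.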