Laboratory of Computational Chemistry and Biochemistry, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland.
J Chem Theory Comput. 2023 Feb 14;19(3):1080-1097. doi: 10.1021/acs.jctc.2c01078. Epub 2023 Jan 24.
Identifying the most stable structure(s) of a system is a prerequisite for calculating any of its properties from first principles. However, even for relatively small molecules, exhaustive exploration of the potential energy surface (PES) is severely hampered by the dimensionality bottleneck. In this work, we address the challenging task of efficiently sampling realistic low-lying peptide coordinates with a surrogate-based genetic algorithm (GA)/density functional theory (DFT) approach (sGADFT), in which promising candidates provided by the GA are ultimately optimized with DFT. We benchmark several computational methods (GAFF, AMOEBApro13, PM6, PM7, DFTB3-D3(BJ)) as possible prescanning surrogates and apply sGADFT to two test systems: (i) two isomer families of the protonated Gly-Pro-Gly-Gly tetrapeptide (Masson, A.; 2015, 26, 1444-1454) and (ii) the doubly protonated cyclic decapeptide gramicidin S (Nagornova, N. S.; 2010, 132, 4040-4041). We show that our GA procedure can correctly identify low-energy minima in as little as a few hours. Subsequent DFT relaxation of the surrogate low-energy structures within a given energy threshold (≤10 kcal/mol for (i), ≤5 kcal/mol for (ii)) invariably led to the identification of the most stable structures as determined from high-resolution infrared (IR) spectroscopy at low temperature. The sGADFT method therefore constitutes a highly efficient route for screening realistic low-lying peptide structures in the gas phase, as needed, for instance, for the interpretation and assignment of experimental IR spectra.
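The two-stage idea described above (a GA search on a cheap surrogate, followed by refinement of only those candidates that fall inside an energy window) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `surrogate_energy` is a hypothetical toy function of a few dihedral angles in arbitrary units, standing in for a real surrogate such as a force field or semiempirical method; the GA operators (elitist truncation, one-point crossover, random-reset mutation) and all parameter values are assumptions chosen for brevity; and the DFT relaxation step is not shown.

```python
import math
import random


def surrogate_energy(dihedrals):
    """Toy surrogate energy (hypothetical, arbitrary units), not a real force field."""
    return sum(
        1.0 + math.cos(3.0 * phi) + 0.5 * (1.0 - math.cos(phi))
        for phi in dihedrals
    )


def run_ga(n_dihedrals=6, pop_size=40, generations=50, mutation_rate=0.2, seed=0):
    """Evolve dihedral-angle vectors toward low surrogate energy.

    Returns the final population sorted by ascending surrogate energy.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    population = [
        [rng.uniform(-math.pi, math.pi) for _ in range(n_dihedrals)]
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Elitist truncation: the lower-energy half survives unchanged.
        population.sort(key=surrogate_energy)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_dihedrals)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Random-reset mutation: occasionally redraw a single angle.
            child = [
                rng.uniform(-math.pi, math.pi) if rng.random() < mutation_rate else phi
                for phi in child
            ]
            children.append(child)
        population = parents + children
    return sorted(population, key=surrogate_energy)


def select_within_window(candidates, energy_fn, window):
    """Keep candidates whose energy lies within `window` of the lowest one."""
    energies = [energy_fn(c) for c in candidates]
    e_min = min(energies)
    return [c for c, e in zip(candidates, energies) if e - e_min <= window]


if __name__ == "__main__":
    final_population = run_ga()
    # Assumed window of 1.0 in the toy units; the abstract uses 5-10 kcal/mol.
    shortlist = select_within_window(final_population, surrogate_energy, window=1.0)
    print(f"{len(shortlist)} of {len(final_population)} candidates would go on to refinement")
```

In a real workflow, each structure in `shortlist` would then be handed to a DFT geometry optimization. The window trades cost against the risk of discarding a structure that the surrogate ranks poorly but DFT ranks well, which is presumably why the thresholds quoted above differ between the two test systems.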