Wang Xiaocong, Gao Jun
Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
RSC Adv. 2020 Jan 2;10(2):666-673. doi: 10.1039/c9ra09337k.
Furanoses, which are components of many important biomolecules, have complicated conformational spaces because of their flexible ring and exocyclic moieties. Machine learning algorithms, which require descriptors as structural inputs, can efficiently compute conformational adaptive (CA) charges that capture the variations in electrostatic potential caused by conformational changes in molecular mechanics (MM) calculations. In the present study, we introduce the atom type symmetry function (ATSF), developed from the atom-centered symmetry function (ACSF), to describe the conformations of furanoses. In ATSF, atoms are categorized by the atom types that classical MM force field parameters define from their properties or connectivity, which yields a suitable coordinate size. Random forest regression (RFR) models with ATSF improved the prediction of CA charges and dipole moments for furanoses compared with models using ACSF or atom name symmetry functions, in which atoms are categorized by their unique atom names. The CA charges predicted by RFR models with ATSF reproduced the carbohydrate-water and carbohydrate-protein interactions computed with RESP charges individually derived from QM calculations more closely than did the ensemble-averaged atomic charge sets commonly employed in MM force fields, suggesting that the predicted CA charges include electrostatic variations in their dynamic charge values. These improvements indicate that categorizing atoms by atom type introduces chemical structural perception into the descriptors and gives ATSF a suitable coordinate size for capturing the key structural features of furanoses. Because MM force fields are broadly implemented, this categorization scheme also allows ATSF to be readily adopted for other biomolecules.
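The effect of the categorization scheme on descriptor size can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard Behler-Parrinello radial symmetry function with a cosine cutoff, and the function names, parameter values (eta, r_shift, r_cut), toy coordinates, atom names, and atom types below are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict

def cosine_cutoff(r, r_cut):
    """Cosine cutoff that decays smoothly to zero at r_cut."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)

def radial_descriptor(coords, labels, center, eta=0.5, r_shift=0.0, r_cut=6.0):
    """Radial symmetry-function values for atom `center`.

    Neighbor contributions are summed per label, so the descriptor has
    one component per distinct label among the neighbors. The labels
    may be atom names or atom types; eta, r_shift, and r_cut are
    illustrative values, not values taken from the paper.
    """
    totals = defaultdict(float)
    for j, (pos, label) in enumerate(zip(coords, labels)):
        if j == center:
            continue
        r = math.dist(coords[center], pos)
        totals[label] += math.exp(-eta * (r - r_shift) ** 2) * cosine_cutoff(r, r_cut)
    return dict(sorted(totals.items()))

# Hypothetical toy fragment (coordinates in angstroms); not a real furanose.
coords = [
    (0.0, 0.0, 0.0),
    (1.5, 0.0, 0.0),
    (-0.7, 1.2, 0.0),
    (2.2, 1.2, 0.0),
    (-1.6, 1.0, 0.3),
    (3.1, 1.0, 0.3),
]
atom_names = ["C1", "C2", "O1", "O2", "HO1", "HO2"]  # unique per atom
atom_types = ["CT", "CT", "OH", "OH", "HO", "HO"]    # assumed force-field-style types

by_name = radial_descriptor(coords, atom_names, center=0)
by_type = radial_descriptor(coords, atom_types, center=0)

print("components when grouping by atom name:", len(by_name))
print("components when grouping by atom type:", len(by_type))
```

In this toy case, grouping neighbors by atom type merges chemically equivalent atoms (here the two hydroxyl oxygens and the two hydroxyl hydrogens) into shared components, so the descriptor is shorter than the one built from unique atom names. Vectors of this kind, computed per atom and per conformation, are the sort of input that a regression model such as a random forest could be trained on.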