

QSSR Modeling of Bacillus Subtilis Lipase A Peptide Collision Cross-Sections in Ion Mobility Spectrometry: Local Descriptor Versus Global Descriptor.

Affiliations

School of Life Sciences, Jiangsu University, Zhenjiang, 212013, China.

Key Lab of Reproduction Regulation of NPFPC-Shanghai Institute of Planned Parenthood Research (SIPPR), Fudan University Reproduction and Development Institution, Shanghai, China.

Publication information

Protein J. 2021 Feb;40(1):54-62. doi: 10.1007/s10930-020-09960-7. Epub 2021 Jan 16.

Abstract

To investigate structure-dependent peptide mobility behavior in ion mobility spectrometry (IMS), quantitative structure-spectrum relationship (QSSR) models are systematically built to predict the collision cross-section (Ω) values of a total of 162 singly protonated tripeptide fragments extracted from Bacillus subtilis lipase A. Two types of structure characterization, local and global descriptors, and three machine learning methods, partial least squares (PLS), support vector machine (SVM) and Gaussian process (GP), are employed to parameterize the structures of these peptide samples and correlate them with their Ω values. The local descriptor is derived from a principal component analysis (PCA) of 516 physicochemical properties of the 20 standard amino acids and is used to characterize, in sequence order, the three amino acid residues that make up a tripeptide. The global descriptor is calculated using the CODESSA method, which can generate > 200 statistically significant variables to characterize the whole molecular structure of a tripeptide. The resulting QSSR models are evaluated rigorously via tenfold cross-validation and Monte Carlo cross-validation (MCCV). A comprehensive comparison is performed on the statistics obtained from each combination of descriptor type and machine learning method. The local descriptor-based QSSR models show better fitting ability and predictive power, but worse interpretability, than those based on the global descriptor. In addition, because QSSR modeling with the local descriptor does not consider the three-dimensional conformation of the tripeptide samples, it is considerably more efficient than modeling with the global descriptor.
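The local-descriptor workflow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the per-residue scores are random placeholders standing in for PCA scores of amino acid properties, the Ω-like targets are synthetic, and a plain linear least-squares fit stands in for PLS, SVM or GP. The function names (`make_placeholder_scores`, `encode_tripeptide`, `mccv_q2`, and so on) are hypothetical and do not come from the paper.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def make_placeholder_scores(n_components=3, seed=0):
    """Hypothetical per-residue scores; NOT the paper's PCA scores."""
    rng = random.Random(seed)
    return {aa: [rng.uniform(-1.0, 1.0) for _ in range(n_components)]
            for aa in AMINO_ACIDS}


def encode_tripeptide(seq, scores):
    """Local descriptor: concatenate residue scores in sequence order."""
    if len(seq) != 3:
        raise ValueError("expected a tripeptide (3 residues)")
    vec = []
    for aa in seq:
        vec.extend(scores[aa])
    return vec


def make_synthetic_data(scores, n_samples=60, seed=1):
    """Synthetic targets (linear in the descriptor plus noise), not real data."""
    rng = random.Random(seed)
    n_features = 3 * len(next(iter(scores.values())))
    true_w = [rng.uniform(-2.0, 2.0) for _ in range(n_features)]
    X, y = [], []
    for _ in range(n_samples):
        seq = "".join(rng.choice(AMINO_ACIDS) for _ in range(3))
        x = encode_tripeptide(seq, scores)
        X.append(x)
        y.append(100.0 + sum(w * v for w, v in zip(true_w, x))
                 + rng.gauss(0.0, 0.05))
    return X, y


def fit_linear(X, y, lam=1e-6):
    """Least squares with intercept and a tiny ridge term (stand-in model)."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    # Normal equations: (A^T A + lam * I) w = A^T y
    M = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
                b[r] -= f * b[col]
    return [b[i] / M[i][i] for i in range(p)]


def predict(w, x):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def mccv_q2(X, y, n_splits=50, test_fraction=0.2, seed=0):
    """Monte Carlo CV: repeated random train/test splits, pooled Q^2."""
    rng = random.Random(seed)
    n = len(y)
    n_test = max(1, int(round(test_fraction * n)))
    press, ss = 0.0, 0.0
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        w = fit_linear([X[i] for i in train], [y[i] for i in train])
        mean_train = sum(y[i] for i in train) / len(train)
        for i in test:
            press += (y[i] - predict(w, X[i])) ** 2
            ss += (y[i] - mean_train) ** 2
    return 1.0 - press / ss


scores = make_placeholder_scores()
X, y = make_synthetic_data(scores)
print("descriptor length:", len(X[0]))
print("MCCV Q^2 on synthetic data:", round(mccv_q2(X, y), 3))
```

Because the descriptor is built from a per-residue lookup table, encoding a tripeptide needs no three-dimensional structure, which is the efficiency argument the abstract makes for the local descriptor. In practice a library implementation of PLS, SVM or GP would replace `fit_linear`, and real PCA scores would replace the placeholder table.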

