Predicting liquid chromatographic retention times of peptides from the Drosophila melanogaster proteome by machine learning approaches.

Authors

Tian Feifei, Yang Li, Lv Fenglin, Zhou Peng

Affiliation

College of Bioengineering, Chongqing University, Shazheng Road #174, Chongqing 400044, China.

Publication information

Anal Chim Acta. 2009 Jun 30;644(1-2):10-6. doi: 10.1016/j.aca.2009.04.010. Epub 2009 Apr 14.

DOI: 10.1016/j.aca.2009.04.010
PMID: 19463555

Abstract

Three machine learning algorithms, least-squares support vector machine (LSSVM), random forest (RF), and Gaussian process (GP), were used to model the quantitative structure-retention relationship (QSRR) for predicting and explaining the retention behavior of proteome-wide peptides in reverse-phase liquid chromatography. Peptides were parameterized using the CODESSA approach, yielding 145 descriptors per peptide that cover diverse structural information, including constitutional, topological, geometrical, and physicochemical properties. On this basis, the nonlinear LSSVM, RF, and GP methods, as well as a sophisticated linear method, partial least-squares (PLS) regression, were employed in QSRR model development. A series of systematic validations (internal cross-validation, an external test, and Monte Carlo cross-validation) confirmed the stability and predictive power of the constructed models. The results show that regression models developed with the nonlinear approaches (LSSVM, RF, and GP) predict better than the linear PLS models. Because the retention times used in this work were measured on different columns and thus carry a relatively large uncertainty (reproducibility within 7%), the best statistics, obtained from GP modeling, are satisfactory: the coefficients of determination (R²) for the training set and test set are 0.894 and 0.866, respectively.
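
The validation scheme named in the abstract (Monte Carlo cross-validation scored by R²) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, a single hypothetical descriptor replaces the 145 CODESSA descriptors, a one-variable least-squares line stands in for the LSSVM, RF, GP, and PLS models, and all function names and parameter values are assumptions made for illustration.

```python
import random
import statistics


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = a*x + b (stand-in model, one descriptor)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def monte_carlo_cv(x, y, n_rounds=200, test_fraction=0.3, seed=0):
    """Repeatedly split at random, refit on the training part, score the rest."""
    rng = random.Random(seed)
    n = len(x)
    n_test = max(2, int(round(n * test_fraction)))
    scores = []
    for _ in range(n_rounds):
        idx = list(range(n))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        pred = [a * x[i] + b for i in test]
        scores.append(r_squared([y[i] for i in test], pred))
    return scores


# Synthetic stand-in data (assumption): one descriptor, noisy "retention time".
data_rng = random.Random(42)
descriptor = [data_rng.uniform(0.0, 10.0) for _ in range(120)]
retention = [3.0 * d + 5.0 + data_rng.gauss(0.0, 2.0) for d in descriptor]

scores = monte_carlo_cv(descriptor, retention)
print(f"mean R2 = {statistics.fmean(scores):.3f}, sd = {statistics.stdev(scores):.3f}")
```

The spread of the held-out R² values across random splits is what indicates model stability; a model whose score swings widely between splits would be considered less reliable even if its average score were high.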

Similar articles

1
Predicting liquid chromatographic retention times of peptides from the Drosophila melanogaster proteome by machine learning approaches.
Anal Chim Acta. 2009 Jun 30;644(1-2):10-6. doi: 10.1016/j.aca.2009.04.010. Epub 2009 Apr 14.
2
Comprehensive comparison of eight statistical modelling methods used in quantitative structure-retention relationship studies for liquid chromatographic retention times of peptides generated by protease digestion of the Escherichia coli proteome.
J Chromatogr A. 2009 Apr 10;1216(15):3107-16. doi: 10.1016/j.chroma.2009.01.086. Epub 2009 Jan 31.
3
Modeling and prediction of peptide drift times in ion mobility spectrometry using sequence-based and structure-based approaches.
Comput Biol Med. 2011 May;41(5):272-7. doi: 10.1016/j.compbiomed.2011.03.002. Epub 2011 Mar 24.
4
Modeling and prediction of retention behavior of histidine-containing peptides in immobilized metal-affinity chromatography.
J Sep Sci. 2009 Jun;32(12):2159-69. doi: 10.1002/jssc.200800739.
5
Retention prediction of peptides based on uninformative variable elimination by partial least squares.
J Proteome Res. 2006 Jul;5(7):1618-25. doi: 10.1021/pr0600430.
6
Novel approaches to predict the retention of histidine-containing peptides in immobilized metal-affinity chromatography.
Proteomics. 2008 Jun;8(11):2185-95. doi: 10.1002/pmic.200700788.
7
The evaluation of two-step multivariate adaptive regression splines for chromatographic retention prediction of peptides.
Proteomics. 2007 May;7(10):1664-77. doi: 10.1002/pmic.200600676.
8
Investigation of different linear and nonlinear chemometric methods for modeling of retention index of essential oil components: concerns to support vector machine.
J Hazard Mater. 2009 Jul 30;166(2-3):853-9. doi: 10.1016/j.jhazmat.2008.11.097. Epub 2008 Dec 3.
9
Advanced QSRR modeling of peptides behavior in RPLC.
Talanta. 2010 Jun 15;81(4-5):1711-8. doi: 10.1016/j.talanta.2010.03.028. Epub 2010 Mar 25.
10
Accurate quantitative structure-property relationship model to predict the solubility of C60 in various solvents based on a novel approach using a least-squares support vector machine.
J Phys Chem B. 2005 Nov 3;109(43):20565-71. doi: 10.1021/jp052223n.

Cited by

1
Systematic Modeling, Prediction, and Comparison of Domain-Peptide Affinities: Does it Work Effectively With the Peptide QSAR Methodology?
Front Genet. 2022 Jan 14;12:800857. doi: 10.3389/fgene.2021.800857. eCollection 2021.
2
Locus-specific Retention Predictor (LsRP): A Peptide Retention Time Predictor Developed for Precision Proteomics.
Sci Rep. 2017 Mar 17;7:43959. doi: 10.1038/srep43959.