Improved prediction of anti-angiogenic peptides based on machine learning models and comprehensive features from peptide sequences.

Affiliations

Department of Computer Science and Information Engineering, Asia University, Taichung, 41354, Taiwan.

Department of Bioinformatics and Medical Engineering, Asia University, Taichung, 41354, Taiwan.

Publication information

Sci Rep. 2024 Jun 22;14(1):14387. doi: 10.1038/s41598-024-65062-9.

DOI:10.1038/s41598-024-65062-9

PMID:38909149

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11193773/

Abstract

Angiogenesis is a key process in the proliferation and metastatic spread of cancer cells. Anti-angiogenic peptides (AAPs), which can inhibit angiogenesis, are promising candidates for cancer treatment. We propose AAPL, a sequence-based predictor that identifies AAPs using machine learning models with improved prediction accuracy. Each peptide sequence was transformed into a vector of 4335 numeric values according to 58 different feature types, followed by a heuristic algorithm for feature selection. Next, the hyperparameters of six machine learning models were optimized with respect to the selected feature subset. We considered two datasets, one with entire peptide sequences and the other with the 15 amino acids at the peptide N-termini. AAPL achieved Matthews correlation coefficients (MCC) of 0.671 and 0.756 in independent tests on the two datasets, respectively, outperforming existing predictors by 5.3% to 24.6%. Further analyses show that AAPL yields higher prediction accuracy for peptides with more hydrophobic residues and fewer hydrophilic and charged residues. The source code of AAPL is available at https://github.com/yunzheng2002/Anti-angiogenic.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8c9/11193773/b176b61c35b4/41598_2024_65062_Fig1_HTML.jpg
