Improved prediction of anti-angiogenic peptides based on machine learning models and comprehensive features from peptide sequences.

Author information

Department of Computer Science and Information Engineering, Asia University, Taichung, 41354, Taiwan.

Department of Bioinformatics and Medical Engineering, Asia University, Taichung, 41354, Taiwan.

Publication information

Sci Rep. 2024 Jun 22;14(1):14387. doi: 10.1038/s41598-024-65062-9.

Abstract

Angiogenesis is a key process in the proliferation and metastatic spread of cancer cells. Anti-angiogenic peptides (AAPs), which can inhibit angiogenesis, are promising candidates for cancer treatment. We propose AAPL, a sequence-based predictor that identifies AAPs using machine learning models with improved prediction accuracy. Each peptide sequence was transformed into a vector of 4335 numeric values according to 58 different feature types, followed by a heuristic algorithm for feature selection. Next, the hyperparameters of six machine learning models were optimized with respect to the selected feature subset. We considered two datasets, one with entire peptide sequences and the other with the 15 amino acids from the peptide N-termini. AAPL achieved Matthews correlation coefficients of 0.671 and 0.756 in independent tests on the two datasets, respectively, outperforming existing predictors by 5.3% to 24.6%. Further analyses show that AAPL yields higher prediction accuracy for peptides with more hydrophobic residues and fewer hydrophilic and charged residues. The source code of AAPL is available at https://github.com/yunzheng2002/Anti-angiogenic.
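
As a rough illustration of the quantities mentioned in the abstract, the following Python sketch shows how a peptide might be truncated to its 15 N-terminal residues, encoded as numeric features, and how a Matthews correlation coefficient is computed from binary predictions. It is not taken from the AAPL source code. The function names are hypothetical, and amino acid composition is used only as an assumed example of a sequence descriptor; the abstract does not list the 58 feature types.

```python
import math

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def n_terminal(seq, length=15):
    """Return the first `length` residues (sequences are written N-terminus first)."""
    return seq[:length]


def aa_composition(seq):
    """Fraction of each standard amino acid in `seq` (a 20-value feature vector).

    Assumption: this descriptor is only an illustration of turning a sequence
    into numbers; it is not confirmed to be one of AAPL's feature types.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("empty peptide sequence")
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = AAP, 0 = non-AAP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical peptide, used only to exercise the functions.
peptide = "ACDEFGHIKLMNPQRSTVWY"
features = aa_composition(n_terminal(peptide))
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which is why it is often preferred over plain accuracy when the two classes are imbalanced.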
