Using the Random Forest for Identifying Key Physicochemical Properties of Amino Acids to Discriminate Anticancer and Non-Anticancer Peptides.

Affiliations

College of Biomedical Engineering, Sichuan University, Chengdu 610065, China.

College of Life Science, Sichuan University, Chengdu 610065, China.

Publication information

Int J Mol Sci. 2023 Jun 29;24(13):10854. doi: 10.3390/ijms241310854.

DOI:10.3390/ijms241310854

PMID:37446031

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC10341712/

Abstract

Anticancer peptides (ACPs) represent a promising new therapeutic approach in cancer treatment. They can target cancer cells without affecting healthy tissues or altering normal physiological functions. Machine learning algorithms are increasingly used to predict peptide sequences with potential ACP effects. This study analyzed four benchmark datasets using a well-established random forest (RF) algorithm. The peptide sequences were converted into 566 physicochemical features extracted from the amino acid index (AAindex) library, which were then subjected to feature selection using four methods: light gradient-boosting machine (LGBM), analysis of variance (ANOVA), chi-squared test (Chi), and mutual information (MI). By presenting and merging the selected features using Venn diagrams, the study identified 19 key amino acid physicochemical properties that can be used to predict the likelihood of a peptide sequence functioning as an ACP. The results were quantified with performance evaluation metrics to determine the accuracy of predictions. This study aims to enhance the efficiency of designing peptide sequences for cancer treatment.
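
As a rough illustration of two steps named in the abstract (turning a peptide sequence into AAindex-based features, then merging the features chosen by several selection methods), the following Python sketch may help. It is not the authors' code. Averaging an index over the residues is an assumed encoding, the Kyte-Doolittle hydropathy scale (AAindex entry KYTJ820101) stands in for the 566 indices, the vote threshold in `merge_selections` is an assumed merging rule, and the selector outputs are placeholder names rather than results from the study.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale (AAindex entry KYTJ820101).
# Used here only as one example index; the study draws 566 indices from AAindex.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_peptide(sequence, indices):
    """Average each amino acid index over the residues of one peptide.

    Assumption: mean-over-residues encoding; the abstract does not state
    how sequences were mapped to the 566 features.
    """
    features = {}
    for name, scale in indices.items():
        # Residues missing from the scale (e.g. 'X') are skipped.
        values = [scale[aa] for aa in sequence.upper() if aa in scale]
        features[name] = sum(values) / len(values) if values else 0.0
    return features


def merge_selections(selections, min_votes):
    """Return features chosen by at least `min_votes` selection methods.

    Assumption: a vote threshold approximates reading overlaps off a Venn diagram.
    """
    votes = Counter(f for chosen in selections.values() for f in set(chosen))
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Hypothetical selector outputs; the feature names are placeholders.
selections = {
    "LGBM": {"idx_a", "idx_b", "idx_c"},
    "ANOVA": {"idx_a", "idx_b", "idx_d"},
    "Chi": {"idx_a", "idx_c", "idx_d"},
    "MI": {"idx_a", "idx_b", "idx_e"},
}

if __name__ == "__main__":
    print(encode_peptide("FLPK", {"hydropathy": KYTE_DOOLITTLE}))
    print(merge_selections(selections, min_votes=3))
```

Setting `min_votes` to the number of selectors gives the strict intersection of all four sets, while lower values admit features that only some methods agree on. The resulting feature subset would then be passed to a random forest classifier for evaluation.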

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a34/10341712/4e6d9ccad56c/ijms-24-10854-g001.jpg
