Pred-AHCP: Robust Feature Selection-Enabled Sequence-Specific Prediction of Anti-Hepatitis C Peptides via Machine Learning.

Authors

Saraswat Akash, Sharma Utsav, Gandotra Aryan, Wasan Lakshit, Artham Sainithin, Maitra Arijit, Singh Bipin

Affiliations

Department of Applied Sciences, School of Engineering and Technology, BML Munjal University, Gurugram, Haryana 122413, India.

Department of Computer Science and Engineering, School of Engineering and Technology, BML Munjal University, Gurugram, Haryana 122413, India.

Publication information

J Chem Inf Model. 2024 Dec 23;64(24):9111-9124. doi: 10.1021/acs.jcim.4c00900. Epub 2024 Nov 6.

Abstract

Every year, an estimated 1.5 million people worldwide contract Hepatitis C, a significant contributor to liver problems. Although many studies have explored machine learning's potential to predict antiviral peptides, very few have addressed the problem of predicting peptides against specific viruses such as Hepatitis C. In this study, we demonstrate the application and fine-tuning of machine learning (ML) algorithms to predict peptides that are effective against Hepatitis C virus (HCV). We developed a fine-tuned and explainable ML model that harnesses the amino acid sequence of a peptide to predict its anti-hepatitis C potential. Specifically, features were computed based on sequence and physicochemical properties. The feature selection was performed using a combined strategy of mutual information and variance inflation factor. This facilitated the removal of redundant and multicollinear features, enhancing the model's generalizability in predicting anti-hepatitis C peptides (AHCPs). The model using the random forest algorithm produced the best performance with an accuracy of about 92%. The feature analysis highlights that the distributions of hydrophobicity, polarizability, coil-forming residues, frequency of glycine residues and the existence of dipeptide motifs VL, LV, and CC emerged as the key predictors for identifying AHCPs targeting different components of HCV. The developed model can be accessed through the Pred-AHCP web server, provided at http://tinyurl.com/web-Pred-AHCP. This resource facilitates the prediction and re-engineering of AHCPs for designing peptide-based therapeutics while also proposing an exploration of similar strategies for designing peptide inhibitors effective against other viruses. The developed ML model can also be used for validating peptide sequences generated using generative artificial intelligence methods for further optimization.
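As an illustration of the sequence-derived features named in the abstract (glycine frequency and the dipeptide motifs VL, LV, and CC), the following is a minimal sketch of how such values might be computed from a peptide string. It is not the authors' implementation: the function names, the normalization by the number of overlapping dipeptides, and the example peptide are all assumptions made for illustration.

```python
from collections import Counter


def residue_frequency(sequence: str, residue: str) -> float:
    """Fraction of positions in `sequence` occupied by `residue`."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return sequence.count(residue) / len(sequence)


def dipeptide_frequencies(sequence: str, motifs=("VL", "LV", "CC")) -> dict:
    """Frequency of each motif among all overlapping dipeptides.

    Normalizing by (length - 1) is an assumed convention, not one
    taken from the paper.
    """
    sequence = sequence.upper()
    total = len(sequence) - 1
    if total <= 0:
        return {motif: 0.0 for motif in motifs}
    counts = Counter(sequence[i:i + 2] for i in range(total))
    return {motif: counts[motif] / total for motif in motifs}


def featurize(sequence: str) -> dict:
    """Collect the illustrative features into one flat dictionary."""
    features = {"freq_G": residue_frequency(sequence, "G")}
    for motif, value in dipeptide_frequencies(sequence).items():
        features[f"dipep_{motif}"] = value
    return features


# Hypothetical peptide, chosen only so that each motif occurs once.
example = featurize("GVLCCLVG")
print(example)
```

In a full pipeline, feature vectors like this would be computed for every peptide and then filtered (the abstract mentions mutual information and the variance inflation factor) before a classifier such as a random forest is trained on them.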
