Kao Hui-Ju, Weng Tzu-Hsiang, Chen Chia-Hung, Yu Chen-Lin, Chen Yu-Chi, Huang Chen-Chen, Huang Kai-Yao, Weng Shun-Long
Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City 30071, Taiwan.
Department of Medical Research, Hsinchu Municipal MacKay Children's Hospital, Hsinchu City 30068, Taiwan.
Int J Mol Sci. 2025 Jun 3;26(11):5356. doi: 10.3390/ijms26115356.
Hepatitis C virus (HCV) infection remains a significant global health burden, driven by the emergence of drug-resistant strains and the limited efficacy of current antiviral therapies. A promising strategy for therapeutic intervention is to target the NS3 protease, a viral enzyme essential for replication. In this study, we present the first computational model specifically designed to identify NS3 protease inhibitory peptides (NS3IPs). Using amino acid composition (AAC) and composition of k-spaced amino acid pairs (CKSAAP) as features, we developed machine learning classifiers based on support vector machine (SVM) and random forest (RF), which achieved accuracies of 98.85% and 97.83%, respectively, as validated through 5-fold cross-validation and independent testing. To make the models accessible, we implemented a web-based tool, iDNS3IP, which enables real-time prediction of NS3IPs. The tool also incorporates a BLAST-based similarity search function, so that users can evaluate inhibitory candidates from both predictive and homology-based perspectives. In addition, we performed feature space analyses using PCA, t-SNE, and LDA based on AAindex descriptors. The resulting visualizations showed distinguishable clustering of NS3IPs and non-inhibitory peptides, suggesting that inhibitory activity may correlate with characteristic physicochemical patterns. This study provides a reliable and interpretable platform to assist in the discovery of therapeutic peptides and supports continued research into peptide-based antiviral strategies for drug-resistant HCV.
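The two sequence encodings named in the abstract, AAC and CKSAAP, can be illustrated with a minimal sketch. The abstract does not state the range of gap sizes k, the normalization, or how non-standard residues are handled, so the choices below (gaps 0 through `k_max`, division by the number of residue pairs at each gap, non-standard residues ignored in the counts) are assumptions, and the function names `aac` and `cksaap` are hypothetical rather than taken from iDNS3IP.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order that defines the feature order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(seq):
    """Amino acid composition: fraction of each standard residue (20 values)."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return [0.0] * len(AMINO_ACIDS)
    # Non-standard residues are not counted but still contribute to the length.
    return [seq.count(a) / n for a in AMINO_ACIDS]


def cksaap(seq, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    Returns 400 * (k_max + 1) values. k_max=2 is an illustrative assumption,
    not a setting reported in the abstract.
    """
    seq = seq.upper()
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        # A sequence of length L contains L - k - 1 residue pairs at gap k.
        total = len(seq) - k - 1
        for i in range(max(total, 0)):
            pair = seq[i] + seq[i + k + 1]
            if pair in counts:  # skip pairs containing non-standard residues
                counts[pair] += 1
        features.extend(counts[p] / total if total > 0 else 0.0 for p in pairs)
    return features


def encode(seq, k_max=2):
    """Concatenate AAC and CKSAAP into a single feature vector."""
    return aac(seq) + cksaap(seq, k_max)
```

A vector produced by `encode` could then be passed to any standard SVM or random forest implementation. The classifiers themselves are omitted here because the abstract gives no hyperparameters.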