School of Artificial Intelligence, Hangzhou Dianzi University, Hangzhou 310018, China.
School of Geography, University of Leeds, Leeds LS2 9JT, UK.
Genes (Basel). 2024 Sep 5;15(9):1170. doi: 10.3390/genes15091170.
Virulence factors (VFs) are key molecules that enable pathogens to evade the immune systems of the host. These factors are crucial for revealing the pathogenic processes of microbes and for drug discovery. Identification of virulence factors in microbes has become an important problem in the field of bioinformatics. To address this problem, this study proposes a novel model, DTVF (Deep Transfer Learning for Virulence Factor Prediction), which integrates the ProtT5 protein sequence feature-extraction model with a dual-channel deep learning model. The dual-channel model combines long short-term memory (LSTM) networks with convolutional neural networks (CNNs) in a single integrated architecture. Furthermore, incorporating an attention mechanism significantly enhanced the accuracy of VF detection. We evaluated DTVF against other high-performing models in the field. On the benchmark dataset, DTVF achieves an accuracy of 84.55% and an AUROC of 92.08%, surpassing the existing models in nearly all metrics and reaching state-of-the-art performance. To make the model accessible to biologists, we have also developed an interactive web-based user interface for DTVF built on Gradio.
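Accuracy and AUROC, the two figures reported above, are standard binary-classification metrics. As a minimal sketch of how such figures are typically computed (this is not the authors' evaluation code), the Python below assumes each protein has a true label (1 = VF, 0 = non-VF) and a predicted VF probability. The function names, the 0.5 decision threshold, and the toy data are illustrative assumptions, not values from the study.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of proteins whose thresholded score matches the true label."""
    # The 0.5 threshold is an assumption; the abstract does not state one.
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random VF outscores a random non-VF (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    # Pairwise comparison: O(n_pos * n_neg), fine for a small illustration.
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical, not from the paper): 1 = VF, 0 = non-VF.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]
toy_accuracy = accuracy(toy_labels, toy_scores)  # 4 of 6 correct
toy_auroc = auroc(toy_labels, toy_scores)        # 8 of 9 pairs ranked correctly
```

Accuracy depends on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is one reason the two are usually reported together.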