DeepT3: deep convolutional neural networks accurately identify Gram-negative bacterial type III secreted effectors using the N-terminal sequence.

Affiliations

School of Public Health, Southwest Medical University, Luzhou, Sichuan, PR China.

Basic Medical College of Southwest Medical University, Luzhou, Sichuan, PR China.

Publication information

Bioinformatics. 2019 Jun 1;35(12):2051-2057. doi: 10.1093/bioinformatics/bty931.

Abstract

MOTIVATION

Various bacterial pathogens deliver secreted substrates, also called effectors, into host cells through Type III secretion systems (T3SSs), causing disease. Because T3SS secreted effectors (T3SEs) play important roles in pathogen-host interactions, identifying them is crucial to understanding the pathogenic mechanisms of T3SSs. However, effectors display a high level of sequence diversity, which makes identification difficult. A novel and effective method is needed to screen bacterial genomes for putative novel effectors, so that candidates can be validated with a small number of key experiments.

RESULTS

We develop a deep convolutional neural network that directly classifies any protein sequence as a T3SE or non-T3SE, which is useful for both effector prediction and the study of sequence-function relationships. Unlike traditional machine learning-based methods, our method automatically extracts T3SE-related features from the 100-residue N-terminal sequence of a protein and maps it to the T3SE space. We train and test our method on datasets curated from 16 species, obtaining an average classification accuracy of 83.7% in 5-fold cross-validation and an accuracy of 92.6% on the test set. Moreover, on a common independent dataset, the accuracy of our method is 6.31-20.73% higher than that of existing state-of-the-art prediction methods. In addition, we visualize the convolutional kernels and identify key features of T3SEs that carry important signal information for secretion. Finally, we apply DeepT3 to several effectors reported in the literature to further demonstrate its use.
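To make the input representation concrete, the following is a minimal sketch of how the first 100 N-terminal residues of a protein could be turned into a numeric matrix and scanned by a single convolutional kernel. It is not the DeepT3 implementation: the one-hot encoding, the zero padding for short sequences and unknown residues, and the function names `encode_n_terminus` and `conv1d_max` are all assumptions made for illustration. The abstract specifies only the 100-residue N-terminal window and the use of convolutional kernels.

```python
# Illustrative sketch only; not the DeepT3 implementation.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
WINDOW = 100  # N-terminal length stated in the abstract


def encode_n_terminus(sequence, window=WINDOW):
    """One-hot encode the first `window` residues as a window x 20 matrix.

    Assumptions: sequences shorter than `window` are padded with all-zero
    rows, and non-standard residues (e.g. 'X') also become all-zero rows.
    """
    matrix = [[0.0] * len(AMINO_ACIDS) for _ in range(window)]
    for pos, residue in enumerate(sequence.upper()[:window]):
        idx = AA_INDEX.get(residue)
        if idx is not None:
            matrix[pos][idx] = 1.0
    return matrix


def conv1d_max(matrix, kernel):
    """Slide one kernel (width x 20) along the sequence and return the
    largest ReLU activation: a 1D convolution plus global max pooling."""
    width = len(kernel)
    best = 0.0  # starting at 0.0 acts as the ReLU floor
    for start in range(len(matrix) - width + 1):
        score = sum(
            matrix[start + i][j] * kernel[i][j]
            for i in range(width)
            for j in range(len(AMINO_ACIDS))
        )
        best = max(best, score)
    return best


# A hand-built kernel that responds to the dipeptide "ST" (hypothetical motif).
motif_kernel = [[1.0 if aa == m else 0.0 for aa in AMINO_ACIDS] for m in "ST"]
encoded = encode_n_terminus("MSTK")
print(conv1d_max(encoded, motif_kernel))  # 2.0: both kernel positions match
```

In a trained network the kernel weights are learned rather than hand-built, and many kernels run in parallel. Inspecting which residues carry large weights in each kernel is one way to read the kernels as sequence motifs, in the spirit of the kernel visualization the abstract describes.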

AVAILABILITY AND IMPLEMENTATION

DeepT3 is freely available at: https://github.com/lje00006/DeepT3.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
