Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China.
School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen 518000, China.
Bioinformatics. 2022 Jul 11;38(14):3541-3548. doi: 10.1093/bioinformatics/btac374.
Phytopathogenic fungi secrete effector proteins to subvert host defenses and facilitate infection. Systematic analysis and prediction of candidate fungal effector proteins are crucial for experimental validation and for the biological control of plant disease. However, two problems remain difficult to solve in fungal effector prediction: the high diversity of effector sequences, which makes protein feature learning harder, and the class imbalance between effector and non-effector samples in the training dataset.
In this study, we use pretrained deep representation learning methods to capture multiple characteristics of protein sequences for fungal effector prediction, and we adapt generative adversarial networks to create synthetic feature samples that address the data imbalance problem. Compared with state-of-the-art fungal effector prediction methods, the resulting model, Effector-GAN, shows an overall improvement in accuracy on the independent test set.
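The idea of using a generative adversarial network to synthesize minority-class feature vectors can be illustrated with a deliberately small sketch. It is not the Effector-GAN architecture, which the abstract does not describe. It assumes a toy elementwise affine generator and a logistic-regression discriminator, trained with hand-written gradients on made-up two-dimensional features. The names `train_toy_gan` and `balance`, the feature values, and all hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(real, steps=2000, lr=0.05, seed=0):
    """Train a toy GAN on minority-class feature vectors (illustrative only).

    Generator:     G(z) = a * z + b (elementwise), z ~ N(0, I)
    Discriminator: D(x) = sigmoid(w . x + c)
    Returns a function that samples one synthetic feature vector.
    """
    rng = random.Random(seed)
    dim = len(real[0])
    a, b = [1.0] * dim, [0.0] * dim  # generator parameters
    w, c = [0.0] * dim, 0.0          # discriminator parameters

    def generate(z):
        return [a[i] * z[i] + b[i] for i in range(dim)]

    def discriminate(x):
        return sigmoid(sum(w[i] * x[i] for i in range(dim)) + c)

    for _ in range(steps):
        x_real = rng.choice(real)
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x_fake = generate(z)

        # Discriminator step: minimise -log D(x_real) - log(1 - D(x_fake)).
        d_real, d_fake = discriminate(x_real), discriminate(x_fake)
        for i in range(dim):
            w[i] -= lr * ((d_real - 1.0) * x_real[i] + d_fake * x_fake[i])
        c -= lr * ((d_real - 1.0) + d_fake)

        # Generator step: minimise -log D(G(z)) (non-saturating loss).
        d_fake = discriminate(x_fake)
        for i in range(dim):
            g = (d_fake - 1.0) * w[i]  # gradient of the loss w.r.t. x_fake[i]
            a[i] -= lr * g * z[i]
            b[i] -= lr * g

    def sample():
        return generate([rng.gauss(0.0, 1.0) for _ in range(dim)])

    return sample


def balance(majority, minority, sample):
    """Append synthetic minority vectors until both classes have equal size."""
    needed = max(0, len(majority) - len(minority))
    return minority + [sample() for _ in range(needed)]


if __name__ == "__main__":
    rng = random.Random(1)
    # Made-up 2-D features: 200 non-effectors versus 20 effectors.
    non_effectors = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
    effectors = [[rng.gauss(2, 0.5), rng.gauss(-1, 0.5)] for _ in range(20)]
    sampler = train_toy_gan(effectors)
    balanced = balance(non_effectors, effectors, sampler)
    print(len(non_effectors), len(balanced))
```

A linear generator and discriminator of this kind tend to oscillate rather than converge, so the synthetic vectors should not be expected to match the real distribution closely. The sketch only shows where adversarial training and rebalancing sit relative to each other; a practical model would use deeper networks trained in a deep learning framework.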
Effector-GAN offers a user-friendly interface to inspect potential fungal effector proteins (http://lab.malab.cn/~wys/webserver/Effector-GAN). The Python script can be downloaded from http://lab.malab.cn/~wys/gitlab/effector-gan.
Supplementary data are available at Bioinformatics online.