Liu Zhe, Gong Yingli, Guo Yuanzhao, Zhang Xiao, Lu Chang, Zhang Li, Wang Han
School of Computer Science and Engineering, Changchun University of Technology, Changchun, China.
School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China.
Front Genet. 2021 Mar 15;12:656140. doi: 10.3389/fgene.2021.656140. eCollection 2021.
Transmembrane proteins (TMPs) are an important class of membrane proteins involved in a wide range of membrane-related biological processes. Because TMPs are major drug targets, their surfaces are of great interest: they form the structural basis for binding drugs and other biological molecules. However, the number of determined TMP structures still falls far short of demand. Artificial intelligence offers a promising way to identify TMP surfaces accurately from sequence alone, without any feature engineering. To this end, we present TMP-SSurface2, an updated TMP surface residue predictor that achieves higher prediction accuracy than our previous version. The method uses an attention-enhanced bidirectional long short-term memory (BiLSTM) network. Thanks to the network's efficient learning capability, useful latent information is extracted from protein sequences, raising the Pearson correlation coefficient (CC) on an independent test dataset from 0.58 for the previous version to 0.66. The results demonstrate that TMP-SSurface2 predicts the surface of transmembrane proteins effectively and represents new progress in modeling transmembrane protein structure from primary sequence. TMP-SSurface2 is freely accessible at https://github.com/NENUBioCompute/TMP-SSurface-2.0.
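The Pearson correlation coefficient (CC) reported above measures how closely predicted per-residue values track the reference values. The following is a minimal sketch of that calculation, not code from TMP-SSurface2: the function name `pearson_cc` and the sample scores are hypothetical and chosen only for illustration.

```python
import math


def pearson_cc(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("inputs must have the same length, at least 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Sum of products of deviations (numerator) and sums of squared
    # deviations (denominator); the 1/n factors cancel in the ratio.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("CC is undefined when either input is constant")
    return cov / math.sqrt(var_p * var_o)


# Hypothetical per-residue surface scores for a five-residue fragment
# (illustrative values only, not data from the paper).
observed = [0.10, 0.45, 0.80, 0.30, 0.65]
predicted = [0.20, 0.40, 0.70, 0.35, 0.60]

cc = pearson_cc(predicted, observed)
print(f"CC = {cc:.2f}")
```

A CC of 1 indicates a perfect positive linear relationship and 0 indicates none, so the reported increase from 0.58 to 0.66 means the new predictions follow the reference values more closely.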