School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China.
School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China; Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, 100081, China.
Comput Biol Med. 2022 Oct;149:105938. doi: 10.1016/j.compbiomed.2022.105938. Epub 2022 Aug 20.
Protein function prediction is one of the most critical tasks in bioinformatics, and computational predictors that can accurately infer protein functions from sequences are highly desired. With the development of protein structure prediction methods, it is worthwhile to explore whether predicted protein structures can improve the performance of protein function prediction. TALE is a successful sequence-based method for protein function prediction. In this study, we therefore employed a TALE-based architecture to integrate sequence embeddings, contact map embeddings, and GO label embeddings to predict protein functions. These embeddings represent proteins at the sequence, structure, and function levels, respectively. The resulting TALE-cmap predictor outperforms other state-of-the-art methods, indicating that structural information is essential for protein function prediction.
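As an illustration of the kind of integration the abstract describes, the following minimal sketch fuses per-residue sequence embeddings, a contact map, and GO label embeddings into per-term scores. It is not the TALE-cmap implementation: the function name `predict_go_scores`, the single neighbourhood-averaging step over the contact map, the projection matrix `w_proj`, and all dimensions are assumptions made for illustration, and random arrays stand in for learned embeddings.

```python
import numpy as np

def predict_go_scores(seq_emb, contact_map, go_emb, w_proj):
    """Score GO terms for one protein (illustrative only).

    seq_emb:     (L, d) per-residue sequence embeddings
    contact_map: (L, L) non-negative residue-residue contacts
    go_emb:      (K, d) one embedding per GO term
    w_proj:      (2d, d) projection of the fused protein vector
    """
    # Add self-loops and row-normalise so each residue averages over its contacts.
    adj = contact_map + np.eye(contact_map.shape[0])
    adj = adj / adj.sum(axis=1, keepdims=True)
    # One neighbourhood-averaging step gives structure-aware residue features.
    struct_emb = adj @ seq_emb
    # Mean-pool over residues and concatenate the sequence and structure views.
    protein_vec = np.concatenate([seq_emb.mean(axis=0), struct_emb.mean(axis=0)])
    # Project into the label space and compare with every GO term embedding.
    logits = go_emb @ (protein_vec @ w_proj)
    # Sigmoid: independent per-term probabilities (multi-label setting).
    return 1.0 / (1.0 + np.exp(-logits))

# Hypothetical toy inputs: 50 residues, 16-dimensional embeddings, 5 GO terms.
rng = np.random.default_rng(0)
L, d, K = 50, 16, 5
seq_emb = rng.normal(size=(L, d))
contact_map = (rng.random((L, L)) < 0.1).astype(float)
contact_map = np.maximum(contact_map, contact_map.T)  # contacts are symmetric
go_emb = rng.normal(size=(K, d))
w_proj = rng.normal(size=(2 * d, d)) * 0.1
scores = predict_go_scores(seq_emb, contact_map, go_emb, w_proj)
```

Here `scores` holds one value per GO term. Scoring each term independently reflects the multi-label nature of function prediction, where a protein can carry many GO annotations at once.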