Vu Thi Thuy Duong, Jung Jaehee
Department of Information and Communication Engineering, Myongji University, Yongin-si, Gyeonggi-do, South Korea.
PeerJ. 2021 Aug 24;9:e12019. doi: 10.7717/peerj.12019. eCollection 2021.
Protein function prediction is a crucial part of genome annotation. Prediction methods have developed rapidly in recent years, owing to the emergence of high-throughput sequencing technologies. Among the available databases for identifying protein function terms, Gene Ontology (GO) is an important resource that describes the functional properties of proteins. Researchers are employing various approaches to predict GO terms efficiently. Meanwhile, deep learning, a fast-evolving data-driven discipline, shows impressive potential for assigning GO terms to amino acid sequences. Herein, we reviewed the currently available computational GO annotation methods for proteins, ranging from conventional to deep learning approaches. Further, we selected several suitable predictors from among the reviewed tools and conducted a small-scale comparison of their performance using a worldwide challenge dataset. Finally, we discussed the major remaining challenges in the field and highlighted future directions for protein function prediction with GO.