Zhao Yingwen, Wang Jun, Chen Jian, Zhang Xiangliang, Guo Maozu, Yu Guoxian
College of Computer and Information Science, Southwest University, Chongqing, China.
State Key Laboratory of Agrobiotechnology and National Maize Improvement Center, China Agricultural University, Beijing, China.
Front Genet. 2020 Apr 24;11:400. doi: 10.3389/fgene.2020.00400. eCollection 2020.
Annotating the functional properties of gene products, i.e., RNAs and proteins, is a fundamental task in biology. The Gene Ontology database (GO) was developed to systematically describe the functional properties of gene products across species and to facilitate the computational prediction of gene function. Because GO is routinely updated, it serves as the gold standard and main knowledge source in functional genomics. Many gene function prediction methods that make use of GO have been proposed, but no literature review has summarized these methods, or the possibilities for future efforts, from the perspective of GO. To bridge this gap, we review the existing methods with an emphasis on recent solutions. First, we introduce the conventions of GO and the widely adopted evaluation metrics for gene function prediction. Next, we summarize current methods of gene function prediction that apply GO in different ways, such as using hierarchical or flat inter-relationships between GO terms, compressing the massive number of GO terms, and quantifying semantic similarities. Although many efforts have improved performance by harnessing GO, we conclude that many important but largely overlooked topics remain for future research.