Department of Biomedical Sciences, University of Padua, Padua, Italy.
CNR Institute of Neuroscience, Padua, Italy.
Nucleic Acids Res. 2019 Jul 2;47(W1):W373-W378. doi: 10.1093/nar/gkz375.
Our current knowledge of complex biological systems is stored in a computable form through the Gene Ontology (GO), which provides a comprehensive description of gene function. Predicting GO terms from sequence remains, however, a challenging task, and one that is particularly critical for novel genomes. Here we present INGA 2.0, a new version of the INGA software for protein function prediction. INGA exploits homology, domain architecture, interaction networks and information from the 'dark proteome', such as transmembrane and intrinsically disordered regions, to generate a consensus prediction. INGA ranked among the top ten methods in both the CAFA2 and CAFA3 blind tests. The new algorithm can process entire genomes in a few hours, or even less when additional input files are provided. The new interface provides a better user experience by integrating filters and widgets to explore the graph structure of the predicted terms. The INGA web server, databases and benchmarking are available at https://inga.bio.unipd.it/.
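To make the idea of a consensus over several evidence sources concrete, the following is a minimal sketch and not INGA's actual algorithm, which the abstract does not specify. It assumes each source assigns a probability-like score to a term, combines scores with a noisy-OR rule (an assumption chosen for illustration), and then propagates each score to the term's ancestors so that a parent never scores lower than its child. The term names, the toy ontology and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy ontology (child -> list of parents); not real GO terms.
PARENTS = {
    "term_leaf": ["term_mid"],
    "term_mid": ["term_root"],
    "term_root": [],
}


def noisy_or(scores):
    """Combine independent probability-like scores: 1 - prod(1 - s)."""
    remaining = 1.0
    for s in scores:
        remaining *= 1.0 - s
    return 1.0 - remaining


def ancestors(term, parents):
    """Return every ancestor of `term` in the child -> parents mapping."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def consensus(per_source, parents):
    """Merge per-source term scores, then propagate them up the graph."""
    by_term = defaultdict(list)
    for source_scores in per_source.values():
        for term, score in source_scores.items():
            by_term[term].append(score)

    combined = {term: noisy_or(scores) for term, scores in by_term.items()}

    # An ancestor is at least as likely as its best-scoring descendant.
    propagated = dict(combined)
    for term, score in combined.items():
        for anc in ancestors(term, parents):
            propagated[anc] = max(propagated.get(anc, 0.0), score)
    return propagated


# Made-up scores from three hypothetical evidence sources.
example_scores = {
    "homology": {"term_leaf": 0.5},
    "domains": {"term_leaf": 0.5, "term_mid": 0.2},
    "interactions": {"term_mid": 0.1},
}
result = consensus(example_scores, PARENTS)
```

In this toy input, two sources support `term_leaf`, so its combined score is higher than either source alone, and `term_mid` and `term_root` inherit that score through propagation. A real predictor would typically weight sources by their measured precision rather than treat them as independent.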