Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA.
Program in Bioinformatics and Computational Biology, Ames, IA, USA.
Genome Biol. 2019 Nov 19;20(1):244. doi: 10.1186/s13059-019-1835-8.
The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function.
Here, we report on the results of the third CAFA challenge, CAFA3, which featured an expanded analysis over the previous CAFA rounds, both in the volume of data analyzed and in the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa, which provided genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster that we suspected of being involved in long-term memory.
We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.