A roadmap for the functional annotation of protein families: a community perspective.

Affiliations

Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA.

Genetics Institute, University of Florida, Gainesville, FL 32611, USA.

Publication information

Database (Oxford). 2022 Aug 12;2022. doi: 10.1093/database/baac062.

Abstract

Over the last 25 years, biology has entered the genomic era and is becoming a science of 'big data'. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by more than 500 000 genomes sequenced to date. By different estimates, only half of the proteins predicted from sequenced genomes carry an accurate functional annotation, and this percentage varies drastically between different organismal lineages. Such a large gap in knowledge hampers all aspects of the biological enterprise and thereby stands in the way of genomic biology reaching its full potential. A brainstorming meeting funded by the National Science Foundation was held on 3-4 February 2022 to address this issue. Bringing together data scientists, biocurators, computational biologists and experimentalists within the same venue allowed for a comprehensive assessment of the current state of functional annotations of protein families. Further, major issues obstructing the field were identified and discussed, which ultimately allowed solutions to be proposed on how to move forward.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/014b/9374478/61512559b5e9/baac062f1.jpg
