Swiss-Prot Group, Swiss Institute of Bioinformatics, 1 rue Michel Servet, 1211, Geneva, Switzerland.
Cell Mol Life Sci. 2010 Apr;67(7):1049-64. doi: 10.1007/s00018-009-0229-6. Epub 2009 Dec 31.
With the dramatic increase in the volume of experimental results in every domain of the life sciences, assembling pertinent data and combining information from different fields has become a challenge. Information is dispersed over numerous specialized databases and is presented in many different formats. Rapid access to experiment-based information about well-characterized proteins helps predict the function of uncharacterized proteins identified by large-scale sequencing. In this context, universal knowledgebases play essential roles by providing access to data from complementary types of experiments and by serving as hubs with cross-references to many specialized databases. Using the UniProt knowledgebase (UniProtKB) and the tools and links provided on its website (http://www.uniprot.org/) as an example, this review outlines how the value of experimental data is optimized by combining high-quality protein sequences with complementary experimental results, including information derived from protein 3D structures. It also discusses the precautions that are necessary for successful predictions and extrapolations.