Dessailly Benoît H, Nair Rajesh, Jaroszewski Lukasz, Fajardo J Eduardo, Kouranov Andrei, Lee David, Fiser Andras, Godzik Adam, Rost Burkhard, Orengo Christine
Department of Structural and Molecular Biology, University College London, London WC1E 6BT, UK.
Structure. 2009 Jun 10;17(6):869-81. doi: 10.1016/j.str.2009.03.015.
One major objective of structural genomics efforts, including the NIH-funded Protein Structure Initiative (PSI), has been to increase the structural coverage of protein sequence space. Here, we present the target selection strategy used during the second phase of PSI (PSI-2). This strategy, jointly devised by the bioinformatics groups associated with the PSI-2 large-scale production centers, targets representatives from large, structurally uncharacterized protein domain families, and from structurally uncharacterized subfamilies in very large and diverse families with incomplete structural coverage. These very large families are extremely diverse both structurally and functionally, and are highly overrepresented in known proteomes. On the basis of several metrics, we then discuss to what extent PSI-2, during its first 3 years, has increased the structural coverage of genomes, and contributed structural and functional novelty. Together, the results presented here suggest that PSI-2 is successfully meeting its objectives and provides useful insights into structural and functional space.