Selecting targets for structural determination by navigating in a graph of protein families.

Author information

Portugaly Elon, Kifer Ilona, Linial Michal

Affiliations

Institute of Computer Sciences and Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University, Jerusalem 91904, Israel.

Publication information

Bioinformatics. 2002 Jul;18(7):899-907. doi: 10.1093/bioinformatics/18.7.899.

DOI:10.1093/bioinformatics/18.7.899

PMID:12117787

Abstract

MOTIVATION

A major goal in structural genomics is to enrich the catalogue of proteins whose 3D structures are known. In an attempt to address this problem we mapped over 10 000 proteins with solved structures onto a graph of all Swissprot protein sequences (release 36, approximately 73 000 proteins) provided by ProtoMap, with the goal of sorting proteins according to their likelihood of belonging to new superfamilies. We hypothesized that proteins within neighbouring clusters tend to share common structural superfamilies or folds. If true, the likelihood of finding new superfamilies increases in clusters that are distal from other solved structures within the graph.

RESULTS

We defined an order relation between unsolved proteins according to their 'distance' from solved structures in the graph, and sorted approximately 48 000 proteins. Our list can be partitioned into three groups: approximately 35 000 proteins sharing a cluster with at least one known structure; approximately 6500 proteins in clusters with no solved structure but with neighbouring clusters containing known structures; and a third group containing the rest of the proteins, approximately 6100 (in 1274 clusters). We tested the quality of the order relation using thousands of recently solved structures that were not included when the order was defined. The tests show that our order is significantly better (P-value approximately 10^(-5)) than a random order. More interestingly, the order within the union of the second and third groups, and the order within the third group alone, perform better than random (P-values: 0.0008 and 0.15, respectively) and are better than alternative orders created using PSI-BLAST. Herein, we present a method for selecting targets to be used in structural genomics projects.
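The ordering idea described above can be illustrated with a small sketch. The abstract does not give the actual definition of 'distance', so the code below assumes the simplest reading: an unweighted hop count between clusters, computed by a multi-source breadth-first search from every cluster that contains a solved structure. The function names, the adjacency-dictionary input, and the toy data are all hypothetical and are not taken from ProtoMap or from the authors' implementation.

```python
from collections import deque


def cluster_distances(neighbours, solved_clusters):
    """Multi-source BFS over the cluster graph.

    Returns the hop count from each reachable cluster to the nearest
    cluster containing a solved structure (assumed distance measure).
    """
    dist = {c: 0 for c in solved_clusters}
    queue = deque(solved_clusters)
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


def rank_unsolved(protein_cluster, neighbours, solved_clusters):
    """Sort proteins so that those farthest from any solved structure come first.

    Clusters with no path to a solved cluster are treated as infinitely far.
    Ties are broken by protein identifier to keep the order deterministic.
    """
    dist = cluster_distances(neighbours, solved_clusters)
    inf = float("inf")
    return sorted(
        protein_cluster,
        key=lambda p: (-dist.get(protein_cluster[p], inf), p),
    )


# Toy data (hypothetical): four clusters, one of which holds a solved structure.
toy_neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
toy_solved = ["A"]
toy_proteins = {"p1": "A", "p2": "B", "p3": "C", "p4": "D"}

if __name__ == "__main__":
    for protein in rank_unsolved(toy_proteins, toy_neighbours, toy_solved):
        print(protein, toy_proteins[protein])
```

Under this assumption, distance 0 corresponds to the first group in the abstract (proteins sharing a cluster with a known structure), distance 1 to the second group (a neighbouring cluster contains a known structure), and everything farther away or unreachable to the third group.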

AVAILABILITY

A list of proteins for target selection, combined with a set of biological filters for narrowing down potential targets, is available at http://www.protarget.cs.huji.ac.il.

Similar articles

Selecting targets for structural determination by navigating in a graph of protein families.

Bioinformatics. 2002 Jul;18(7):899-907. doi: 10.1093/bioinformatics/18.7.899.

Target space for structural genomics revisited.

Bioinformatics. 2002 Jul;18(7):922-33. doi: 10.1093/bioinformatics/18.7.922.

Clustering of proximal sequence space for the identification of protein families.

Bioinformatics. 2002 Jul;18(7):908-21. doi: 10.1093/bioinformatics/18.7.908.

Predicting fold novelty based on ProtoNet hierarchical classification.

Bioinformatics. 2005 Apr 1;21(7):1020-7. doi: 10.1093/bioinformatics/bti135. Epub 2004 Nov 11.

Exploring dynamics of protein structure determination and homology-based prediction to estimate the number of superfamilies and folds.

BMC Struct Biol. 2006 Mar 20;6:6. doi: 10.1186/1472-6807-6-6.

Data mining of sequences and 3D structures of allergenic proteins.

Bioinformatics. 2002 Oct;18(10):1358-64. doi: 10.1093/bioinformatics/18.10.1358.

ProTarget: automatic prediction of protein structure novelty.

Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W81-4. doi: 10.1093/nar/gki389.

About the use of protein models.

Bioinformatics. 2002 Jul;18(7):934-8. doi: 10.1093/bioinformatics/18.7.934.

ProtBuD: a database of biological unit structures of protein families and superfamilies.

Bioinformatics. 2006 Dec 1;22(23):2876-82. doi: 10.1093/bioinformatics/btl490. Epub 2006 Oct 2.

Statistically rigorous automated protein annotation.

Bioinformatics. 2004 May 1;20(7):1066-73. doi: 10.1093/bioinformatics/bth039. Epub 2004 Feb 5.

Cited by

Fishing with (Proto)Net-a principled approach to protein target selection.

Comp Funct Genomics. 2003;4(5):542-8. doi: 10.1002/cfg.328.

A functional hierarchical organization of the protein sequence space.

BMC Bioinformatics. 2004 Dec 14;5:196. doi: 10.1186/1471-2105-5-196.

ProtoNet: hierarchical classification of the protein space.

Nucleic Acids Res. 2003 Jan 1;31(1):348-52. doi: 10.1093/nar/gkg096.
