
Structural footprinting in protein structure comparison: the impact of structural fragments.

Authors

Zotenko Elena, Dogan Rezarta Islamaj, Wilbur W John, O'Leary Dianne P, Przytycka Teresa M

Affiliation

Department of Computer Science, University of Maryland, College Park, MD 20742, USA.

Publication

BMC Struct Biol. 2007 Aug 9;7:53. doi: 10.1186/1472-6807-7-53.

DOI:10.1186/1472-6807-7-53
PMID:17688700
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2082327/
Abstract

BACKGROUND

One approach to speeding up protein structure comparison is the projection approach, in which a protein structure is mapped to a high-dimensional vector and structural similarity is approximated by the distance between the corresponding vectors. Structural footprinting methods are projection methods that share the same general technique for producing the mapping: first select a representative set of structural fragments as models, then map a protein structure to a vector in which each dimension corresponds to a particular model and "counts" the number of times that model appears in the structure. The main difference between any two structural footprinting methods is the set of models they use; in fact, a large number of methods can be generated by varying the type of structural fragments used and the amount of detail in their representation. How do these choices affect the ability of a method to detect various types of structural similarity?
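The mapping described above can be illustrated with a minimal sketch. It is not the implementation of any of the benchmarked methods: the 2-D tuples standing in for fragments, the nearest-model assignment rule, and the function names `nearest_model`, `footprint`, and `euclidean` are all assumptions made for illustration.

```python
import math
from collections import Counter

def nearest_model(fragment, models):
    """Index of the model closest to the fragment (squared Euclidean distance).

    Assumption: fragments and models are plain coordinate tuples; real methods
    use richer fragment representations.
    """
    return min(
        range(len(models)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(fragment, models[i])),
    )

def footprint(fragments, models):
    """Vector whose i-th entry counts the fragments assigned to model i."""
    counts = Counter(nearest_model(f, models) for f in fragments)
    return [counts.get(i, 0) for i in range(len(models))]

def euclidean(u, v):
    """Distance between two footprints, used as a proxy for structural dissimilarity."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical toy data: three models and two "structures" given as fragment lists.
models = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
structure_a = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (4.9, 5.2)]
structure_b = [(0.0, 0.2), (0.2, 0.1), (1.1, 1.0)]

fp_a = footprint(structure_a, models)  # [1, 2, 1]
fp_b = footprint(structure_b, models)  # [2, 1, 0]
distance_ab = euclidean(fp_a, fp_b)    # sqrt(3)
```

Once footprints are precomputed, comparing two structures costs only a vector distance, which is where the speed-up of projection methods comes from.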

RESULTS

To answer this question, we benchmarked against the CATH database three structural footprinting methods that differ significantly in their selection of models. In the first set of experiments we compared the methods' ability to detect structural similarity characteristic of evolutionarily related structures, i.e., structures within the same CATH superfamily. In the second set of experiments we tested the methods' agreement with the boundaries imposed by classification groups at the Class, Architecture, and Fold levels of the CATH hierarchy.
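The abstract does not state how detection ability was scored. One common way to score this kind of benchmark, assumed here purely for illustration, is the area under the ROC curve: the probability that a pair of structures from the same superfamily receives a smaller footprint distance than a pair from different superfamilies. The function name `roc_auc` and the toy data are hypothetical.

```python
def roc_auc(distances, same_superfamily):
    """AUC computed by direct pair counting.

    distances[i] is the footprint distance for structure pair i;
    same_superfamily[i] is True when that pair shares a superfamily.
    Ties between a related and an unrelated pair count as half a win.
    """
    related = [d for d, s in zip(distances, same_superfamily) if s]
    unrelated = [d for d, s in zip(distances, same_superfamily) if not s]
    if not related or not unrelated:
        raise ValueError("need at least one related and one unrelated pair")
    wins = sum(
        1.0 if r < u else 0.5 if r == u else 0.0
        for r in related
        for u in unrelated
    )
    return wins / (len(related) * len(unrelated))

# Hypothetical distances for four structure pairs.
pair_distances = [0.1, 0.4, 0.35, 0.8]
pair_labels = [True, True, False, False]
auc = roc_auc(pair_distances, pair_labels)  # 0.75
```

A value of 1.0 would mean every related pair is closer than every unrelated pair; 0.5 corresponds to random ordering.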

CONCLUSION

In both experiments we found that the method that uses secondary structure information performs best on average, but no single method consistently performs best across all groups at a given classification level. We also found that combining the methods' outputs significantly improves performance. Moreover, our new techniques for measuring and visualizing the methods' agreement with the CATH hierarchy, including the thresholded affinity graph, are useful beyond this work. In particular, they can be used to expose similarities between classification groups in their composition in terms of the structural fragments used by a method, and thus provide an alternative demonstration of the continuous nature of the protein structure universe.
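The abstract does not say how the methods' outputs were combined. As a hedged sketch of one possible scheme (an assumption, not the authors' procedure), each method's distances to a set of candidate structures can be converted to normalized ranks and then averaged, so that methods with different distance scales contribute equally. The names `rank_normalize` and `combine` are hypothetical.

```python
def rank_normalize(scores):
    """Map each score to its rank scaled to [0, 1]; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the score at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    scale = max(len(scores) - 1, 1)
    return [r / scale for r in ranks]

def combine(distance_lists):
    """Average the rank-normalized distances of several methods, candidate by candidate."""
    normalized = [rank_normalize(d) for d in distance_lists]
    return [sum(column) / len(column) for column in zip(*normalized)]

# Hypothetical distances from one query to three candidates under two methods.
method_a = [0.2, 0.9, 0.5]
method_b = [10.0, 30.0, 40.0]
combined = combine([method_a, method_b])  # [0.0, 0.75, 0.75]
```

Rank normalization is chosen here only because it avoids having to calibrate the raw distance scales of different methods against each other; other combination rules are equally plausible.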


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d669/2082327/cea98289239e/1472-6807-7-53-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d669/2082327/640f979b8b6c/1472-6807-7-53-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d669/2082327/c3001c8069c2/1472-6807-7-53-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d669/2082327/f1e87c745111/1472-6807-7-53-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d669/2082327/3116931e1708/1472-6807-7-53-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d669/2082327/b66272174752/1472-6807-7-53-6.jpg
