ProCKSI: a decision support system for Protein (structure) Comparison, Knowledge, Similarity and Information.

Author information

Barthel Daniel, Hirst Jonathan D, Błazewicz Jacek, Burke Edmund K, Krasnogor Natalio

Affiliation

ASAP, School of Computer Science and IT, University of Nottingham, Nottingham, NG8 1BB, UK.

Publication information

BMC Bioinformatics. 2007 Oct 26;8:416. doi: 10.1186/1471-2105-8-416.

DOI: 10.1186/1471-2105-8-416

PMID: 17963510

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2222653/

Abstract

BACKGROUND

We introduce the decision support system for Protein (Structure) Comparison, Knowledge, Similarity and Information (ProCKSI). ProCKSI integrates various protein similarity measures through an easy-to-use interface that allows the comparison of multiple proteins simultaneously. It employs the Universal Similarity Metric (USM), the Maximum Contact Map Overlap (MaxCMO) of protein structures and other external methods such as the DaliLite and the TM-align methods, the Combinatorial Extension (CE) of the optimal path, and the FAST Align and Search Tool (FAST). Additionally, ProCKSI allows the user to upload a user-defined similarity matrix supplementing the methods mentioned, and computes a similarity consensus in order to provide a rich, integrated, multicriteria view of large datasets of protein structures.
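The USM is defined in terms of Kolmogorov complexity, which is not computable, so in practice it is usually approximated with a real compressor. The sketch below shows one common approximation, the normalised compression distance (NCD), using `zlib`. It is an illustration only: the abstract does not state which compressor or structure encoding ProCKSI uses, and the function names and toy byte strings are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (a stand-in for Kolmogorov complexity)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalised compression distance between two byte strings.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C is the compressed size. Values near 0 suggest similar inputs,
    values near 1 suggest unrelated inputs.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical toy inputs: in a real setting these would be serialised
# structure descriptions (for example contact maps), not short strings.
repetitive = b"HHHHEEEECCCC" * 20
unrelated = bytes((i * 37 + 11) % 251 for i in range(240))

print(ncd(repetitive, repetitive))  # small: the copy adds little new information
print(ncd(repetitive, unrelated))   # large: the two inputs share no structure
```

Because real compressors are imperfect, the NCD of an object with itself is small but generally not exactly zero.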

RESULTS

We present ProCKSI's architecture and workflow describing its intuitive user interface, and show its potential on three distinct test-cases. In the first case, ProCKSI is used to evaluate the results of a previous CASP competition, assessing the similarity of proposed models for given targets where the structures could have a large deviation from one another. To perform this type of comparison reliably, we introduce a new consensus method. The second study deals with the verification of a classification scheme for protein kinases, originally derived by sequence comparison by Hanks and Hunter, but here we use a consensus similarity measure based on structures. In the third experiment using the Rost and Sander dataset (RS126), we investigate how a combination of different sets of similarity measures influences the quality and performance of ProCKSI's new consensus measure. ProCKSI performs well with all three datasets, showing its potential for complex, simultaneous multi-method assessment of structural similarity in large protein datasets. Furthermore, combining different similarity measures is usually more robust than relying on one single, unique measure.
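To make the idea of combining several measures concrete, the following sketch rescales each pairwise distance matrix to the range [0, 1] and averages them entry by entry. This is an assumption made for illustration, not the consensus method introduced in the paper, which the abstract does not specify; the function names and the method labels in the example are hypothetical.

```python
from typing import Dict, List

Matrix = List[List[float]]


def min_max_normalise(m: Matrix) -> Matrix:
    """Rescale all entries of a matrix linearly to the range [0, 1]."""
    values = [v for row in m for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant matrix carries no ranking information.
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / (hi - lo) for v in row] for row in m]


def consensus(matrices: Dict[str, Matrix]) -> Matrix:
    """Unweighted mean of min-max normalised matrices of equal size.

    All inputs are assumed to be distances (lower means more similar);
    similarity scores would need converting before they are combined.
    """
    normalised = [min_max_normalise(m) for m in matrices.values()]
    n = len(normalised[0])
    k = len(normalised)
    return [
        [sum(m[i][j] for m in normalised) / k for j in range(n)]
        for i in range(n)
    ]


# Hypothetical distances for three proteins from two methods on different scales.
example = {
    "method_a": [[0, 1, 4], [1, 0, 3], [4, 3, 0]],
    "method_b": [[0, 10, 20], [10, 0, 20], [20, 20, 0]],
}
for row in consensus(example):
    print(row)
```

Normalising first keeps a method with a large numeric range from dominating the average; a weighted mean or rank-based combination would be equally plausible design choices.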

CONCLUSION

Based on a diverse set of similarity measures, ProCKSI computes a consensus similarity profile for the entire protein set. All results can be clustered, visualised, analysed and easily compared with each other through a simple and intuitive interface. ProCKSI is publicly available at http://www.procksi.net for academic and non-commercial use.
