Multi-criteria protein structure comparison and structural similarities analysis using pyMCPSC.

Affiliations

Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens, Greece.

Northeastern University, Boston, Massachusetts, United States of America.

Publication information

PLoS One. 2018 Oct 17;13(10):e0204587. doi: 10.1371/journal.pone.0204587. eCollection 2018.

Abstract

Protein Structure Comparison (PSC) is a well-developed field of computational proteomics that attracts active interest from the research community because it is widely used in structural biology and drug discovery. With new PSC methods continuously emerging and no clear method of choice, Multi-Criteria Protein Structure Comparison (MCPSC) is commonly employed to combine methods and generate consensus structural similarity scores. We present pyMCPSC, a Python-based utility that lets users perform MCPSC efficiently by exploiting the parallelism afforded by the multi-core CPUs of today's desktop computers. We show how pyMCPSC facilitates the analysis of similarities in protein domain datasets and how it can be extended to incorporate new PSC methods as they become available. We demonstrate pyMCPSC with a case study based on the Proteus_300 dataset. The results show that MCPSC scores form a reliable basis for identifying the true classification of a domain, as evidenced by both ROC analysis and Nearest-Neighbor analysis. The structure-similarity-based "Phylogenetic Tree" representations generated by pyMCPSC provide insight into functional groupings within the dataset of domains. Furthermore, scatter plots generated by pyMCPSC show strong correlation between protein domains belonging to SCOP Class C and loose correlation between those of SCOP Class D. Such analyses and the corresponding visualizations help users quickly gain insights about their datasets. The source code of pyMCPSC is available under the GPLv3.0 license through a GitHub repository (https://github.com/xulesc/pymcpsc).
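To make the idea of a consensus score concrete, the following is a minimal sketch of one way scores from several PSC methods might be combined and then used for a nearest-neighbor lookup. It is not pyMCPSC's implementation or API. The function names, the method names `method_a` and `method_b`, the domain identifiers, the score values, the min-max normalization, and the plain averaging are all assumptions made for illustration, and scores are assumed to be dissimilarities (lower means more similar). Parallel execution is omitted.

```python
from statistics import mean


def min_max_normalize(scores):
    """Rescale a {pair: score} mapping to the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores identical: no spread to rescale, so map everything to 0.
        return {pair: 0.0 for pair in scores}
    return {pair: (s - lo) / (hi - lo) for pair, s in scores.items()}


def consensus_scores(method_scores):
    """Average the normalized scores of every method that scored a pair."""
    normalized = {m: min_max_normalize(s) for m, s in method_scores.items()}
    pairs = set().union(*(s.keys() for s in normalized.values()))
    return {
        pair: mean(n[pair] for n in normalized.values() if pair in n)
        for pair in pairs
    }


def nearest_neighbor(query, consensus):
    """Return the domain with the lowest consensus dissimilarity to `query`."""
    candidates = {}
    for (a, b), score in consensus.items():
        if query == a:
            candidates[b] = score
        elif query == b:
            candidates[a] = score
    return min(candidates, key=candidates.get)


# Hypothetical dissimilarity scores from two hypothetical PSC methods
# on three hypothetical domains; the values are invented for illustration.
EXAMPLE_SCORES = {
    "method_a": {("d1", "d2"): 0.2, ("d1", "d3"): 0.8, ("d2", "d3"): 0.6},
    "method_b": {("d1", "d2"): 10.0, ("d1", "d3"): 30.0, ("d2", "d3"): 20.0},
}

if __name__ == "__main__":
    consensus = consensus_scores(EXAMPLE_SCORES)
    for pair in sorted(consensus):
        print(pair, round(consensus[pair], 3))
    print("nearest neighbor of d3:", nearest_neighbor("d3", consensus))
```

Normalizing each method separately before averaging keeps a method with a large numeric range (here `method_b`) from dominating the consensus. In a classification setting, the label of the nearest neighbor would then be taken as the predicted class of the query domain.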
