Assignment of structural domains in proteins using diffusion kernels on graphs.

Affiliations

Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran.

Department of Biophysics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran.

Publication information

BMC Bioinformatics. 2022 Sep 8;23(1):369. doi: 10.1186/s12859-022-04902-9.

DOI:10.1186/s12859-022-04902-9
PMID:36076174
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9461149/
Abstract

Though algorithmic approaches to protein domain decomposition have long been of high interest, the inherent ambiguity of the problem keeps it an active area of research. Accurate automated methods are also in high demand as the number of solved structures for complex proteins rises. While the majority of previous efforts to decompose 3D structures have centered on developing clustering algorithms, the use of enhanced measures of proximity between amino acids has remained largely unexplored. If there exists a kernel function in whose reproducing kernel Hilbert space the structural domains of proteins become well separated, then protein structures can be parsed into domains without a complex clustering algorithm. Inspired by this idea, we developed a protein domain decomposition method based on diffusion kernels on protein graphs. We examined all combinations of four graph node kernels and two clustering algorithms to investigate their capability to decompose protein structures. The proposed method is tested on five of the most commonly used benchmark datasets for protein domain assignment plus a comprehensive non-redundant dataset. The results show that the method, when using one of the diffusion kernels, performs competitively with four of the best automatic methods. Our method can also offer alternative partitionings for the same structure, which is in line with the subjective definition of a protein domain. With competitive accuracy and balanced performance on simple and complex structures, despite relying on a relatively naive criterion to choose the optimal decomposition, the proposed method shows that diffusion kernels on graphs in particular, and kernel functions in general, are promising measures for parsing proteins into domains and for performing different structural analyses on proteins.
The size and interconnectedness of protein graphs make them promising targets for diffusion kernels as measures of affinity between amino acids. The versatility of our method allows future kernels with higher performance to be plugged in. The source code of the proposed method is available at https://github.com/taherimo/kludo . The method is also available as a web application at https://cbph.ir/tools/kludo .
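
The core idea of the abstract, that a good kernel lets a simple clustering step recover the domains, can be illustrated with a minimal sketch. It is not the authors' implementation (see the kludo repository for that). The abstract does not name the four kernels or the two clustering algorithms, so this sketch assumes the classic exponential diffusion kernel K = exp(-βL) on the graph Laplacian L and uses a two-cluster kernel k-means as a stand-in. The toy graph, the value of `beta`, and the function names `two_clique_graph`, `diffusion_kernel`, and `kernel_two_means` are all hypothetical.

```python
import numpy as np


def two_clique_graph(size=4):
    """Toy 'protein graph': two cliques of `size` nodes joined by one edge."""
    n = 2 * size
    A = np.zeros((n, n))
    A[:size, :size] = 1.0
    A[size:, size:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[size - 1, size] = A[size, size - 1] = 1.0  # single bridging contact
    return A


def diffusion_kernel(adjacency, beta=1.0):
    """Exponential diffusion kernel K = exp(-beta * L), with L = D - A.

    L is symmetric, so the matrix exponential is computed from its
    eigendecomposition: K = U diag(exp(-beta * lambda)) U^T.
    """
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    eigvals, eigvecs = np.linalg.eigh(L)
    return (eigvecs * np.exp(-beta * eigvals)) @ eigvecs.T


def kernel_two_means(K, n_iter=20):
    """Two-cluster kernel k-means (a stand-in, not the paper's clustering).

    Seeds are the least-similar pair of nodes under K; each node is then
    repeatedly reassigned to the nearer cluster mean in feature space.
    """
    n = K.shape[0]
    a, b = np.unravel_index(np.argmin(K), K.shape)
    labels = (K[:, b] > K[:, a]).astype(int)  # 0: closer to a, 1: closer to b
    for _ in range(n_iter):
        dist = np.empty((n, 2))
        for c in (0, 1):
            members = labels == c
            if not members.any():  # degenerate split, stop early
                return labels
            m = members.sum()
            # ||phi(i) - mean_c||^2 expressed through kernel values only
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, members].sum(axis=1) / m
                          + K[np.ix_(members, members)].sum() / m ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels


A = two_clique_graph()
K = diffusion_kernel(A, beta=1.0)
labels = kernel_two_means(K)
print(labels)  # nodes 0-3 share one label, nodes 4-7 share the other
```

In this toy case the two cliques play the role of two compact domains connected by a single contact. Which clique receives label 0 depends on the seeding, so only the grouping is meaningful. A real pipeline would build the graph from residue contacts in a solved structure and would need a criterion for choosing the number of domains, which this sketch does not attempt.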

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30d7/9461149/85739c436d6e/12859_2022_4902_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30d7/9461149/778c0256efc2/12859_2022_4902_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30d7/9461149/8a25704e0f44/12859_2022_4902_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30d7/9461149/dd29b4f6414b/12859_2022_4902_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30d7/9461149/4cde566e406d/12859_2022_4902_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30d7/9461149/21c227a14e4e/12859_2022_4902_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30d7/9461149/ae1b52bf278b/12859_2022_4902_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30d7/9461149/0e4f299bb30b/12859_2022_4902_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30d7/9461149/d3bd14fe8e78/12859_2022_4902_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30d7/9461149/22ce3af908d3/12859_2022_4902_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30d7/9461149/99438b8103e6/12859_2022_4902_Fig11_HTML.jpg
