Multiple graph regularized protein domain ranking.

Affiliation

Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia.

Publication information

BMC Bioinformatics. 2012 Nov 19;13:307. doi: 10.1186/1471-2105-13-307.

Abstract

BACKGROUND

Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on pairwise comparison of protein domains and neglect the global manifold structure of the protein domain database. Recently, graph regularized ranking, which exploits the global structure of the graph defined by the pairwise similarities, has been proposed. However, existing graph regularized ranking methods are very sensitive to the choice of graph model and parameters, and making this choice remains a difficult problem for most protein domain ranking methods.

RESULTS

To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of the protein domain distribution by combining multiple initial graphs for the regularization. The graph weights are learned jointly with the ranking scores, and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves better ranking performance than both single graph regularized ranking methods and pairwise similarity based ranking methods.
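The alternating scheme described above can be illustrated with a minimal sketch. The abstract does not state the objective function, so everything specific here is an assumption made for illustration rather than the paper's exact method: the objective `sum_m mu[m] * f.T @ L[m] @ f + alpha * ||f - y||^2 + beta * ||mu||^2` with `mu` constrained to the probability simplex, the use of unnormalized k-NN Gaussian graph Laplacians, and the hypothetical names `knn_laplacian`, `project_to_simplex`, and `multi_graph_rank`.

```python
import numpy as np

def knn_laplacian(X, k, sigma):
    """Unnormalized Laplacian of a symmetrized k-NN Gaussian similarity graph."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Keep each node's k strongest edges, then symmetrize the edge set.
    idx = np.argsort(-W, axis=1)[:, :k]
    mask = np.zeros_like(W, dtype=bool)
    mask[np.arange(len(X))[:, None], idx] = True
    W = np.where(mask | mask.T, W, 0.0)
    return np.diag(W.sum(axis=1)) - W

def project_to_simplex(v):
    """Euclidean projection of v onto {mu : mu >= 0, sum(mu) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def multi_graph_rank(laplacians, y, alpha=1.0, beta=1.0, n_iter=20):
    """Alternate between ranking scores f and graph weights mu (assumed objective)."""
    n_graphs, n = len(laplacians), len(y)
    mu = np.full(n_graphs, 1.0 / n_graphs)  # start from uniform graph weights
    for _ in range(n_iter):
        # f-step: with mu fixed, the objective is quadratic in f.
        L = sum(w * Lm for w, Lm in zip(mu, laplacians))
        f = np.linalg.solve(L + alpha * np.eye(n), alpha * y)
        # mu-step: with f fixed, minimizing s @ mu + beta * ||mu||^2 on the
        # simplex is the projection of -s / (2 * beta) onto the simplex.
        s = np.array([f @ Lm @ f for Lm in laplacians])
        mu = project_to_simplex(-s / (2.0 * beta))
    return f, mu
```

To use the sketch, build several Laplacians from the same feature matrix with different `k` and `sigma`, set `y` to 1 at the query domain and 0 elsewhere, and sort the database by the returned `f`. Under the assumed objective, a small `beta` concentrates the weight on the graph over which the scores are smoothest, while a large `beta` keeps the weights close to uniform.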

CONCLUSION

The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This generalization opens a new direction for applying multiple graphs to protein domain ranking.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc86/3583823/2d1ab8ab921d/1471-2105-13-307-1.jpg
