Modelling and Recognition of Protein Contact Networks by Multiple Kernel Learning and Dissimilarity Representations.

Authors

Martino Alessio, De Santis Enrico, Giuliani Alessandro, Rizzi Antonello

Affiliations

Department of Information Engineering, Electronics and Telecommunications, University of Rome "La Sapienza", Via Eudossiana 18, 00184 Rome, Italy.

Department of Environment and Health, Istituto Superiore di Sanità, Viale Regina Elena 299, 00161 Rome, Italy.

Publication information

Entropy (Basel). 2020 Jul 21;22(7):794. doi: 10.3390/e22070794.

Abstract

Multiple kernel learning is a paradigm that employs a properly constructed chain of kernel functions to analyse different data, or different representations of the same data, simultaneously. In this paper, we propose a hybrid classification system based on a linear combination of multiple kernels defined over multiple dissimilarity spaces. The core of the training procedure is the joint optimisation of the kernel weights and of the selection of representatives in the dissimilarity spaces. This equips the system with a two-fold knowledge discovery phase: analysing the weights shows which representations are more suitable for solving the classification problem, whereas the pivotal patterns selected as representatives can give further insights into the modelled system, possibly with the help of field experts. The proposed classification system is tested on real proteomic data in order to predict proteins' functional role starting from their folded structure: specifically, a set of eight representations is drawn from the graph-based description of the folded protein. The proposed multiple kernel-based system has also been benchmarked against a clustering-based classification system that can likewise exploit multiple dissimilarities simultaneously. Computational results show remarkable classification capabilities, and the knowledge discovery analysis is in line with current biological knowledge, suggesting the reliability of the proposed system.
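
To make the idea of a linear combination of kernels defined over several dissimilarity spaces more concrete, the following is a minimal sketch, not the authors' implementation. It assumes two toy representations (`view_a`, `view_b`), Euclidean and Manhattan dissimilarities, hand-picked representatives, fixed weights and an RBF kernel; all of these names and choices are illustrative assumptions. In the paper, the weights and the representatives are instead found by a joint optimisation, and eight graph-derived representations are used.

```python
import numpy as np

def dissimilarity_embedding(patterns, representatives, dissimilarity):
    """Describe each pattern by its dissimilarities to the chosen representatives."""
    return np.array([[dissimilarity(p, r) for r in representatives] for p in patterns])

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clamp tiny negative round-off

def combined_kernel(embeddings, weights, gamma=1.0):
    """Convex combination of one RBF kernel per dissimilarity space."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the weights sum to one
    return sum(wi * rbf_kernel(E, E, gamma) for wi, E in zip(w, embeddings))

# Toy data (assumption): six patterns seen through two different representations.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(6, 3))
view_b = rng.normal(size=(6, 5))

# Illustrative dissimilarity measures, one per representation (assumption).
euclidean = lambda x, y: float(np.linalg.norm(x - y))
manhattan = lambda x, y: float(np.abs(x - y).sum())

# Hand-picked representatives (assumption); the paper selects them during training.
reps_a = view_a[[0, 3]]
reps_b = view_b[[1, 2, 5]]

E_a = dissimilarity_embedding(view_a, reps_a, euclidean)  # shape (6, 2)
E_b = dissimilarity_embedding(view_b, reps_b, manhattan)  # shape (6, 3)

# Fixed weights (assumption); larger weights mark more useful representations.
K = combined_kernel([E_a, E_b], weights=[0.7, 0.3])
```

The resulting matrix `K` is a valid kernel matrix, because a non-negative weighted sum of positive semi-definite kernels is itself positive semi-definite, so it could be handed to any kernel machine that accepts a precomputed kernel. Inspecting the learned weights and the selected representatives is what the abstract refers to as the two-fold knowledge discovery phase.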

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4fc3/7517365/3d43e0cb52d4/entropy-22-00794-g001.jpg
