Combining multiple clusterings for protein structure prediction.

Author information

Sakar C Okan, Kursun Olcay, Seker Huseyin, Gurgen Fikret

Publication information

Int J Data Min Bioinform. 2014;10(2):162-74. doi: 10.1504/ijdmb.2014.064012.

Abstract

Computational annotation and prediction of protein structure are very important in the post-genome era because of the existence of many different proteins, most of which are yet to be verified. Mutual-information-based feature selection methods can be used to select minimal yet predictive subsets of features. However, protein features are organised into natural partitions (views). Individual feature selection that ignores these views, dismantles them, and intermixes their variables with those of other views results, at best, in a complex, uninterpretable predictive system for such multi-view datasets. In this paper, instead of selecting a subset of individual features, each feature subset (view) is passed through a clustering step so that it is represented in discrete form by its cluster indices; this makes mutual-information-based methods applicable to view selection. We present experimental results on a multi-view protein dataset used to predict protein structure.
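
The idea of discretising each view by clustering and then scoring it with mutual information can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name the clustering algorithm, the number of clusters, or the selection criterion, so the tiny k-means routine, the choice of k = 2, the toy data, and the names `kmeans_labels`, `mutual_information`, `rank_views`, `view_a` and `view_b` are all assumptions made for illustration only.

```python
import math
from collections import Counter


def kmeans_labels(points, k, iters=20):
    """Tiny deterministic k-means (Lloyd's algorithm); returns one cluster index per point.

    Assumption: k <= len(points); centroids start from evenly spaced input points.
    """
    step = max(1, len(points) // k)
    centroids = [list(points[i * step]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def mutual_information(a, b):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (n_xy / n) * math.log2(n_xy * n / (count_a[x] * count_b[y]))
        for (x, y), n_xy in count_ab.items()
    )


def rank_views(views, class_labels, k):
    """Cluster each view, then rank views by MI between cluster indices and class labels."""
    scores = {
        name: mutual_information(kmeans_labels(points, k), class_labels)
        for name, points in views.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: eight samples, two classes, two views.
class_labels = [0, 0, 0, 0, 1, 1, 1, 1]
views = {
    # view_a separates the two classes cleanly.
    "view_a": [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
               (5.0, 5.1), (5.2, 4.9), (4.8, 5.0), (5.1, 5.3)],
    # view_b forms two clear clusters, but they are unrelated to the classes.
    "view_b": [(0.0,), (5.0,), (0.1,), (5.1,), (0.2,), (5.2,), (0.3,), (5.3,)],
}

ranking = rank_views(views, class_labels, k=2)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

Because each view is reduced to a single discrete variable (its cluster index), the mutual information is estimated from a small contingency table regardless of how many raw features the view contains, which is what lets whole views, rather than individual features, be compared and selected.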
