Protein family classification using sparse Markov transducers.

Authors

Eskin E, Grundy W N, Singer Y

Affiliation

Department of Computer Science, Columbia University, USA.

Publication

Proc Int Conf Intell Syst Mol Biol. 2000;8:134-45.

Abstract

In this paper we present a method for classifying proteins into families using sparse Markov transducers (SMTs). Like probabilistic suffix trees, SMTs estimate a probability distribution conditioned on an input sequence. SMTs generalize probabilistic suffix trees by allowing wildcards in the conditioning sequences. Because amino acid substitutions are common in protein families, incorporating wildcards into the model significantly improves classification performance. We present two models for building protein family classifiers using SMTs. We also present efficient data structures that reduce the memory usage of the models. We evaluate SMTs by building protein family classifiers using the Pfam database and compare our results to previously published results.
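To make the role of wildcards in the conditioning sequence concrete, the following is a minimal toy sketch, not the authors' SMT algorithm. It assumes a fixed context length, enumerates every wildcard pattern over that context, and averages add-one-smoothed estimates uniformly; the class name `ToyWildcardModel`, the `*` wildcard symbol, the three-letter alphabet, and the uniform averaging are all illustrative assumptions. The abstract does not describe how the paper's models weight or structure their contexts.

```python
from collections import defaultdict
from itertools import product

WILDCARD = "*"  # assumed placeholder symbol; must not occur in the alphabet


def masked_context(context, mask):
    """Replace each position where mask is True with the wildcard symbol."""
    return "".join(WILDCARD if m else c for c, m in zip(context, mask))


class ToyWildcardModel:
    """Toy next-symbol predictor conditioned on contexts with wildcards.

    Illustrative only: it averages all wildcard patterns uniformly, which
    is an assumption of this sketch and not the method from the paper.
    """

    def __init__(self, order, alphabet, pseudocount=1.0):
        self.order = order
        self.alphabet = list(alphabet)
        self.pseudocount = pseudocount
        # Every way of wildcarding the context positions (2**order masks).
        self.masks = list(product([False, True], repeat=order))
        # counts[masked context][next symbol] -> number of occurrences
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, sequences):
        for seq in sequences:
            for i in range(self.order, len(seq)):
                context = seq[i - self.order:i]
                for mask in self.masks:
                    self.counts[masked_context(context, mask)][seq[i]] += 1

    def prob(self, context, symbol):
        """Uniform average of smoothed estimates over all wildcard masks."""
        total = 0.0
        for mask in self.masks:
            seen = self.counts.get(masked_context(context, mask), {})
            n = sum(seen.values())
            total += (seen.get(symbol, 0) + self.pseudocount) / (
                n + self.pseudocount * len(self.alphabet)
            )
        return total / len(self.masks)


# Tiny made-up training set over a three-letter alphabet.
model = ToyWildcardModel(order=2, alphabet="ACD")
model.train(["ACD", "ADD"])
print(model.prob("AA", "D"))  # context "AA" never occurred exactly
```

In this toy setup the exact context `AA` was never observed, so a model without wildcards would fall back to the uniform smoothed estimate of 1/3 for `D`. The wildcard patterns `A*` and `**` do match training contexts, which raises the estimate; this is the kind of tolerance to substitutions that the abstract attributes to wildcards.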
