A machine learning approach for the identification of odorant binding proteins from sequence-derived properties.

Authors

Pugalenthi Ganesan, Tang Ke, Suganthan P N, Archunan G, Sowdhamini R

Affiliation

School of Electrical and Electronic Engineering, Nanyang Technological University, 639798, Singapore.

Publication

BMC Bioinformatics. 2007 Sep 19;8:351. doi: 10.1186/1471-2105-8-351.

Abstract

BACKGROUND

Odorant binding proteins (OBPs) are believed to shuttle odorants from the environment to the underlying odorant receptors, for which they could potentially serve as odorant presenters. Although several sequence-based search methods have been used for protein family prediction, little effort has been devoted to predicting OBPs from sequence data, a task made more challenging by the low sequence identity among these proteins.

RESULTS

In this paper, we propose a new algorithm that uses a Regularized Least Squares Classifier (RLSC) in conjunction with multiple physicochemical properties of amino acids to predict odorant binding proteins. The algorithm was applied to a dataset derived from the Pfam and GenDiS databases and achieved an overall prediction accuracy of 97.7% (94.5% and 98.4% for the positive and negative classes, respectively).
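
To make the idea concrete, the following is a minimal sketch of a linear regularized least squares classifier in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the three residue groups, the toy sequences, the regularization value `lam`, and the helper names (`featurize`, `solve`, `train_rlsc`, `predict`) are all hypothetical, and the paper's actual feature set and kernel are not given in this abstract.

```python
# Illustrative residue groups (an assumption, not the paper's feature set).
GROUPS = {
    "hydrophobic": set("AVLIMFWC"),
    "polar": set("STNQYGPH"),
    "charged": set("DEKR"),
}


def featurize(seq):
    """Fraction of residues falling in each illustrative group."""
    n = len(seq)
    return [sum(ch in group for ch in seq) / n for group in GROUPS.values()]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def train_rlsc(X, y, lam=1e-3):
    """Linear RLSC: minimise ||Xw - y||^2 + lam * ||w||^2.

    The minimiser satisfies (X^T X + lam * I) w = X^T y.
    """
    d = len(X[0])
    A = [
        [sum(x[i] * x[j] for x in X) + (lam if i == j else 0.0) for j in range(d)]
        for i in range(d)
    ]
    b = [sum(x[i] * t for x, t in zip(X, y)) for i in range(d)]
    return solve(A, b)


def predict(w, x):
    """Classify by the sign of the linear score w . x."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


# Toy, made-up sequences for illustration only (not real OBP data).
positives = ["AVLIMFWCAVLK", "LLVVAASTIIFF"]
negatives = ["DEKRDEKRSTNQ", "KKEEDDRRGGAV"]
X = [featurize(s) for s in positives + negatives]
y = [1.0] * len(positives) + [-1.0] * len(negatives)
w = train_rlsc(X, y)
print([predict(w, x) for x in X])
```

The classifier is trained by solving a single linear system, which is what distinguishes RLSC from iterative methods. Because the features are computed per sequence rather than by alignment, a sketch like this does not depend on sequence similarity between proteins.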

CONCLUSION

Our study suggests that RLSC is potentially useful for predicting odorant binding proteins from sequence-derived properties, irrespective of sequence similarity. Our method correctly predicts 92.8% of 56 odorant binding proteins that are non-homologous to any protein in the Swiss-Prot database and 97.1% of the 414 proteins in an independent dataset, suggesting that RLSC can facilitate the prediction of odorant binding proteins from sequence information.
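
As a quick sanity check, the reported percentages can be converted back into approximate protein counts. The counts below are inferred from the rounded percentages and are not stated in the abstract; `implied_count` is a hypothetical helper introduced only for this illustration.

```python
def implied_count(percent, total):
    """Nearest whole number of proteins matching a reported percentage."""
    return round(percent / 100 * total)


# Percentages and totals as reported in the abstract.
non_homologous = implied_count(92.8, 56)
independent = implied_count(97.1, 414)
print(non_homologous, independent)
```

Under this rounding assumption, the figures correspond to roughly 52 of 56 non-homologous proteins and 402 of 414 independent dataset proteins.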

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a5bb/2216042/e289b1353527/1471-2105-8-351-1.jpg