Samira-VP: A simple protein alignment method with rechecking the alphabet vector positions.

Authors

Fotoohifiroozabadi Samira, Mohamad Mohd Saberi, Deris Safaai

Affiliations

1 Artificial Intelligence and Bioinformatics Research Group, Faculty of Computing, Universiti Teknologi Malaysia, Skudai 81310 Johor, Malaysia.

2 Faculty of Creative Technology & Heritage, Universiti Malaysia Kelantan, Locked Bag 01, 16300 Bachok, Kota Bharu, Kelantan, Malaysia.

Publication information

J Bioinform Comput Biol. 2017 Apr;15(2):1750004. doi: 10.1142/S0219720017500044. Epub 2017 Jan 26.

Abstract

Protein structure alignment and comparison methods that are based on an alphabetical representation of protein structure are simpler to run and faster to evaluate; however, their accuracy is not as reliable as that of three-dimensional (3D) tools. As a one-dimensional (1D) candidate, TS-AMIR used an alphabetic representation of the secondary-structure elements (SSEs) of proteins and compared the letters assigned to each SSE using the n-gram method. Although the results were comparable to those obtained via geometrical methods, the SSE length and the accuracy of adjacency between SSEs were not considered in the comparison process. Therefore, to capture more information on the adjacency between SSE vectors, the present study adopted a new approach that assigns text to vectors according to the spherical coordinate system. Moreover, dynamic programming was applied in order to account for the length of SSE vectors. Five common datasets were selected for method evaluation. The first three datasets were small but difficult to align, and the remaining two were used to compare the proposed method with other methods on a large protein dataset. The results showed that the proposed method, as a text-based alignment approach, obtained results comparable to both 1D and 3D methods. It outperformed 1D methods in terms of accuracy and 3D methods in terms of runtime.
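
The abstract names two ingredients: letters assigned to SSE vectors according to the spherical coordinate system, and dynamic programming that accounts for SSE length. The following is a minimal Python sketch of how such a pipeline could look, not the authors' implementation. The bin counts, the 24-letter alphabet, the length-ratio match score, and the function names `vector_to_letter` and `align_score` are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math

# Assumed discretisation: 4 polar bins x 6 azimuthal bins = 24 letters.
# These values are illustrative and do not come from the paper.
THETA_BINS = 4
PHI_BINS = 6
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWX"


def vector_to_letter(v):
    """Map a 3D direction vector to a letter by binning its spherical angles."""
    x, y, z = v
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0:
        raise ValueError("zero-length vector has no direction")
    theta = math.acos(max(-1.0, min(1.0, z / r)))  # polar angle in [0, pi]
    phi = math.atan2(y, x) % (2 * math.pi)         # azimuth in [0, 2*pi)
    # Clamp to the last bin so the upper boundary does not overflow.
    t = min(int(theta / math.pi * THETA_BINS), THETA_BINS - 1)
    p = min(int(phi / (2 * math.pi) * PHI_BINS), PHI_BINS - 1)
    return ALPHABET[t * PHI_BINS + p]


def align_score(a, b, mismatch=-1.0, gap=-1.0):
    """Global alignment score (Needleman-Wunsch) of two SSE strings.

    a, b: lists of (letter, length) pairs, one pair per SSE.
    Matching letters score the ratio of the shorter to the longer SSE
    length (an assumed way to make the score length-aware).
    """
    n, m = len(a), len(b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            (la, na), (lb, nb) = a[i - 1], b[j - 1]
            sub = min(na, nb) / max(na, nb) if la == lb else mismatch
            dp[i][j] = max(
                dp[i - 1][j - 1] + sub,  # align the two SSEs
                dp[i - 1][j] + gap,      # gap in b
                dp[i][j - 1] + gap,      # gap in a
            )
    return dp[n][m]


if __name__ == "__main__":
    # Hypothetical SSE direction vectors and lengths (in residues).
    protein_a = [(vector_to_letter((0, 0, 1)), 10), (vector_to_letter((1, 1, 1)), 5)]
    protein_b = [(vector_to_letter((0, 0, 1)), 8), (vector_to_letter((1, 1, 1)), 5)]
    print(protein_a, protein_b, align_score(protein_a, protein_b))
```

In this sketch, two SSEs with the same letter but different lengths receive a match score below 1, which is one simple way to let the alignment reflect SSE length. A real scoring scheme would need to be tuned against benchmark datasets.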

