FFP: joint Fast Fourier transform and fractal dimension in amino acid property-aware phylogenetic analysis.

Affiliations

School of Computer, Electronics and Information, Guangxi University, Nanning, China.

Guangxi Normal University for Nationalities, Chongzuo, China.

Publication information

BMC Bioinformatics. 2022 Aug 19;23(1):347. doi: 10.1186/s12859-022-04889-3.

Abstract

BACKGROUND

Amino acid property-aware phylogenetic analysis (APPA) is phylogenetic analysis based on amino acid property encoding. It is used to understand and infer evolutionary relationships between species from a molecular perspective. Fast Fourier transform (FFT) and Higuchi's fractal dimension (HFD) perform well at describing the structural and complexity information of sequences for APPA. With protein sequence data growing exponentially, a reliable APPA method for protein sequence analysis is increasingly important.

RESULTS

We propose a new method, FFP, which combines FFT and HFD. First, FFP encodes protein sequences using an important physicochemical property of amino acids, the dissociation constant, which determines the acidity and basicity of protein molecules. Second, FFT and HFD are used to generate feature vectors from the encoded sequences, and a distance matrix is then computed with the cosine function to describe the degree of similarity between species: the smaller the distance between two species, the more similar they are. Finally, the phylogenetic tree is constructed. When tested for phylogenetic analysis on four groups of protein sequences, FFP gave clearly better results than the methods it was compared against, with a highest accuracy above 97%.
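The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the encoding table or the exact feature construction, so the pKa values (approximate textbook values for the alpha-carboxyl group), the choice of eight FFT magnitudes, `k_max=5`, the function names, and the toy sequences are all assumptions. Tree construction is left out, since the abstract does not name the algorithm used.

```python
import numpy as np

# Assumption: approximate textbook pKa values of the alpha-carboxyl group.
# The table used in the paper may differ.
PKA = {
    "A": 2.34, "R": 2.17, "N": 2.02, "D": 1.88, "C": 1.96,
    "E": 2.19, "Q": 2.17, "G": 2.34, "H": 1.82, "I": 2.36,
    "L": 2.36, "K": 2.18, "M": 2.28, "F": 1.83, "P": 1.99,
    "S": 2.21, "T": 2.11, "W": 2.38, "Y": 2.20, "V": 2.32,
}

def encode(seq):
    """Map each residue to its pKa value; unknown symbols are skipped."""
    return np.array([PKA[a] for a in seq.upper() if a in PKA], dtype=float)

def fft_features(signal, n_coeffs=8):
    """Magnitudes of the first n_coeffs non-DC FFT terms (hypothetical choice)."""
    spectrum = np.abs(np.fft.rfft(signal - signal.mean())) / len(signal)
    feats = np.zeros(n_coeffs)
    usable = spectrum[1:n_coeffs + 1]
    feats[:len(usable)] = usable  # zero-pad if the sequence is very short
    return feats

def higuchi_fd(signal, k_max=5):
    """Higuchi's fractal dimension: slope of log L(k) against log(1/k)."""
    x = np.asarray(signal, dtype=float)
    n = len(x)
    log_inv_k, log_len = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(k):  # start offsets 0..k-1
            sub = x[m::k]
            steps = len(sub) - 1
            if steps < 1:
                continue
            # Curve length at lag k, with Higuchi's normalisation factor.
            lengths.append(np.abs(np.diff(sub)).sum() * (n - 1) / (steps * k) / k)
        if lengths and np.mean(lengths) > 0:
            log_inv_k.append(np.log(1.0 / k))
            log_len.append(np.log(np.mean(lengths)))
    if len(log_len) < 2:
        raise ValueError("signal too short or constant for a slope fit")
    slope, _ = np.polyfit(log_inv_k, log_len, 1)
    return slope

def feature_vector(seq):
    """Concatenate FFT magnitudes and the HFD of the encoded sequence."""
    signal = encode(seq)
    return np.append(fft_features(signal), higuchi_fd(signal))

def distance_matrix(seqs):
    """Pairwise cosine distance (1 - cosine similarity) between feature vectors."""
    feats = np.array([feature_vector(s) for s in seqs])
    unit = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return 1.0 - np.clip(unit @ unit.T, -1.0, 1.0)

# Made-up toy sequences, for illustration only.
toy = ["MKTAYIAKQRQISFVKSHFSRQ", "MKTAYIAKQRQLSFVKSHFSRQ", "GSHMLEDPVAWKTLLRNACDYQ"]
print(np.round(distance_matrix(toy), 4))
```

The resulting matrix is symmetric with a zero diagonal, and smaller entries mean more similar sequences. It could be passed to any standard distance-based tree-building method.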

CONCLUSION

FFP achieves higher accuracy in APPA and multiple sequence alignment, and it measures protein sequence similarity effectively. We hope it will prove useful in APPA-related research.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d0a6/9392226/6ab821b62119/12859_2022_4889_Fig1_HTML.jpg
