Mathematical Approach to Protein Sequence Comparison Based on Physiochemical Properties.

Authors

Pal Jayanta, Ghosh Soumen, Maji Bansibadan, Bhattacharya Dilip Kumar

Affiliations

Department of ECE, National Institute of Technology, Durgapur 713209, India.

Department of CSE, Narula Institute of Technology, Kolkata 700109, India.

Publication information

ACS Omega. 2022 Oct 17;7(43):39446-39455. doi: 10.1021/acsomega.2c06103. eCollection 2022 Nov 1.

DOI:10.1021/acsomega.2c06103
PMID:36340165
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9631895/
Abstract

The difficult part of developing new protein sequence comparison techniques is devising a method that can handle huge data sets of sequences of various lengths quickly and effectively. In this work, we first obtain two numerical representations of protein sequences, one based on a physical property of amino acids and one based on a chemical property. The lengths of all the sequences under comparison are made equal by appending the required number of zeros. Then, the fast Fourier transform is applied to this numerical time series to obtain the corresponding spectrum. Next, the spectrum values are reduced by the standard inter-coefficient difference method. Finally, the normalized values of the reduced spectrum are selected as the descriptors for protein sequence comparison. From these descriptors, distance matrices are computed using the Euclidean distance and are then used to draw phylogenetic trees with the UPGMA algorithm. Phylogenetic trees are first constructed for 9 ND4, 9 ND5, and 9 ND6 proteins using polarity as the chemical property and molecular weight as the physical property. Comparing them shows that polarity is a better choice than molecular weight for protein sequence comparison. Next, using polarity, phylogenetic trees are obtained for 12 baculovirus and 24 transferrin proteins. The results are compared with those obtained earlier on the identical sequences by other methods. Three assessment criteria are considered: quality based on rationalized perception, quantitative measures based on symmetric distance, and computational speed. In all cases, the results are found to be more satisfactory.
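
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the polarity table uses values commonly attributed to Grantham (1974), since the abstract does not list the scale the paper used; the inter-coefficient difference is assumed to be the difference between successive spectral magnitudes; normalization is assumed to mean scaling to unit Euclidean norm; and sequences are padded to the next power of two (rather than only to the longest length) so that a simple radix-2 FFT applies. The function names and toy sequences are hypothetical.

```python
import cmath
import math

# Illustrative polarity scale (values commonly attributed to Grantham, 1974).
# Assumption: the abstract does not reproduce the table the authors used.
POLARITY = {
    "A": 8.1, "R": 10.5, "N": 11.6, "D": 13.0, "C": 5.5,
    "Q": 10.5, "E": 12.3, "G": 9.0, "H": 10.4, "I": 5.2,
    "L": 4.9, "K": 11.3, "M": 5.7, "F": 5.2, "P": 8.0,
    "S": 9.2, "T": 8.6, "W": 5.4, "Y": 6.2, "V": 5.9,
}


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def descriptor(seq, length):
    """Map, zero-pad, transform, difference, and normalize one sequence."""
    signal = [POLARITY[aa] for aa in seq]
    signal += [0.0] * (length - len(signal))  # append zeros to equalize length
    magnitude = [abs(c) for c in fft(signal)]
    # Assumed ICD form: difference between successive spectral magnitudes.
    icd = [magnitude[k + 1] - magnitude[k] for k in range(len(magnitude) - 1)]
    norm = math.sqrt(sum(v * v for v in icd))
    return [v / norm for v in icd] if norm else icd


def distance_matrix(seqs):
    """Pairwise Euclidean distances between the descriptors of all sequences."""
    longest = max(len(s) for s in seqs)
    length = 1 << (longest - 1).bit_length()  # next power of two >= longest
    descs = [descriptor(s, length) for s in seqs]
    return [[math.dist(a, b) for b in descs] for a in descs]


# Hypothetical toy sequences, not taken from the paper's data sets.
toy = ["MKTAYIAKQR", "MKTAYIAKQRQISF", "GGSPWLLVCC"]
D = distance_matrix(toy)
```

The resulting matrix `D` is symmetric with a zero diagonal. If SciPy is available, the condensed form of such a matrix can be passed to `scipy.cluster.hierarchy.linkage` with `method="average"`, which performs UPGMA clustering, to obtain a tree.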

