Detailed protein sequence alignment based on Spectral Similarity Score (SSS).

Authors

Gupta Kshitiz, Thomas Dina, Vidya S V, Venkatesh K V, Ramakumar S

Affiliation

Department of Computer Science & Engineering, Indian Institute of Technology, Bombay, Mumbai, India.

Publication

BMC Bioinformatics. 2005 Apr 23;6:105. doi: 10.1186/1471-2105-6-105.

DOI:10.1186/1471-2105-6-105
PMID:15850477
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1131888/
Abstract

BACKGROUND

The chemical properties and biological function of a protein are a direct consequence of its primary structure. Several algorithms have been developed to determine the alignment and similarity of primary protein sequences. However, character-based similarity cannot provide insight into the structural aspects of a protein. We present a method based on spectral similarity that compares subsequences of amino acids which behave similarly but are not aligned well when amino acids are treated as mere characters. The approach computes a similarity score between sequences for any given attribute, such as the hydrophobicity of amino acids, from spectral information obtained after partial conversion to the frequency domain.
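The idea in the last sentence can be illustrated with a minimal sketch. It is not the authors' SSS formula, which the abstract does not specify. It assumes the Kyte-Doolittle hydropathy index as the attribute, reads "partial conversion to the frequency domain" as a discrete Fourier transform over short sliding windows, and uses cosine similarity between magnitude spectra as a stand-in score. The names `to_signal`, `magnitude_spectrum`, and `best_window_pair` are hypothetical.

```python
import cmath
import math

# Kyte-Doolittle hydropathy index: an assumed choice of attribute.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def to_signal(seq):
    """Replace each residue letter with its numeric attribute value."""
    return [KD[aa] for aa in seq.upper()]

def magnitude_spectrum(window):
    """DFT magnitudes of a mean-centered window for k = 1 .. n // 2."""
    n = len(window)
    mean = sum(window) / n
    centered = [x - mean for x in window]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            centered[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        spectrum.append(abs(coeff))
    return spectrum

def cosine(u, v):
    """Cosine similarity; 0.0 when either vector has zero norm."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)

def best_window_pair(seq_a, seq_b, width=8):
    """Return (score, i, j) for the most spectrally similar pair of windows.

    Both sequences must be at least `width` residues long.
    """
    sig_a, sig_b = to_signal(seq_a), to_signal(seq_b)
    spec_a = [magnitude_spectrum(sig_a[i:i + width])
              for i in range(len(sig_a) - width + 1)]
    spec_b = [magnitude_spectrum(sig_b[j:j + width])
              for j in range(len(sig_b) - width + 1)]
    return max(
        (cosine(sa, sb), i, j)
        for i, sa in enumerate(spec_a)
        for j, sb in enumerate(spec_b)
    )
```

Because only magnitudes are compared, two windows with the same periodic hydrophobicity pattern can score highly even when they share few identical residues, which is the kind of match a character-based aligner would miss. The window width of 8 is an arbitrary illustrative value.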

RESULTS

Distance matrices were built for various branches of the human kinome, that is, the full complement of human kinases. These matrices matched the phylogenetic tree of the human kinome, establishing the efficacy of the algorithm's global alignment. PKCd and PKCe kinases share close biological properties and structural similarities but do not score highly with character-based alignments. Detailed comparison established close similarities between subsequences that have no significant character identity. We compared their known 3D structures to establish that the algorithm picks out subsequences that character-based matching algorithms do not consider similar but that share structural similarities. Likewise, many subsequences with low character identity were picked out between the xyna-theau and xyna-clotm F/10 xylanases. Comparison of the 3D structures of these subsequences confirmed their structural similarity.
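The distance-matrix step can be sketched as follows, assuming a symmetric pairwise similarity score in the range 0 to 1 and the conversion distance = 1 - similarity; the abstract does not state how the authors derived distances from SSS. The function names and the placeholder score `identity_fraction` are hypothetical, and the placeholder is deliberately character-based so the example stays self-contained. A spectral score would be passed in its place.

```python
from typing import Callable, Dict, List

def distance_matrix(
    seqs: Dict[str, str],
    similarity: Callable[[str, str], float],
) -> List[List[float]]:
    """Build a symmetric distance matrix, in the insertion order of `seqs`.

    Assumes `similarity` is symmetric and returns values in [0, 1].
    """
    names = list(seqs)
    n = len(names)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - similarity(seqs[names[i]], seqs[names[j]])
            dist[i][j] = dist[j][i] = d
    return dist

def identity_fraction(a: str, b: str) -> float:
    """Placeholder score: fraction of identical positions over the shorter length."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n

# Toy sequences, not real kinases.
toy = {"a": "ACDE", "b": "ACDF", "c": "WWWW"}
matrix = distance_matrix(toy, identity_fraction)
```

A matrix of this form is the usual input to distance-based tree-building methods, which is how it could be compared against a reference phylogeny.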

CONCLUSION

We developed an algorithm inspired by the successful application of spectral similarity to music sequences. The method captures subsequences that do not align with traditional character-based alignment tools but give rise to similar secondary and tertiary structures. The Spectral Similarity Score (SSS) extends conventional similarity methods, and the results indicate that it holds strong potential for analyzing various biological sequences and structural variations in proteins.
