A Statistical Similarity/Dissimilarity Analysis of Protein Sequences Based on a Novel Group Representative Vector.

Affiliation

Department of Engineering Mathematics and Physics, Faculty of Engineering, Mansoura University, Mansoura 35516, Egypt.

Publication information

Biomed Res Int. 2019 May 8;2019:8702968. doi: 10.1155/2019/8702968. eCollection 2019.

DOI:10.1155/2019/8702968
PMID:31205946
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6530227/
Abstract

Similarity/dissimilarity analysis is a key way of understanding the biology of an organism, because it reveals the origin of new genes and sequences. Sequence data are grouped according to biological relationships, and the number of sequences in any group may grow every day. Existing alignment-free methods demonstrate their utility by producing a similarity/dissimilarity matrix. Although such a matrix is easy to read, it measures the degree of similarity only between individual sequences. In this work, a representative is introduced for each of three groups of protein sequences, and a similarity/dissimilarity vector, computed against the group representative, is evaluated in place of the ordinary similarity/dissimilarity matrix. The approach is applied to three selected groups of protein sequences: beta globin, NADH dehydrogenase subunit 5 (ND5), and spike protein sequences. A cross-group comparison is performed to confirm the distinctness of each group. A qualitative comparison of our approach with previous articles and with the phylogenetic tree of these protein sequences supports the utility of the approach.
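
The contrast between a pairwise matrix and a vector measured against a group representative can be illustrated with a small sketch. The abstract does not say which sequence descriptor or which representative the authors use, so everything below is an assumption made for illustration only: the descriptor is a plain amino acid composition vector, the representative is the component-wise mean of the group's descriptors, the distance is Euclidean, and the sequences are made-up toy strings. It is not the paper's method.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Assumed descriptor: 20-dimensional amino acid frequency vector."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]

def group_representative(seqs):
    """Assumed representative: component-wise mean of the group's descriptors."""
    vectors = [composition(s) for s in seqs]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def distance(u, v):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def dissimilarity_matrix(seqs):
    """Ordinary approach: one entry per pair of sequences (n x n values)."""
    vectors = [composition(s) for s in seqs]
    return [[distance(u, v) for v in vectors] for u in vectors]

def dissimilarity_vector(seqs, representative):
    """Group-based approach: one entry per sequence (n values)."""
    return [distance(composition(s), representative) for s in seqs]

# Toy sequences invented for this sketch, not real protein data.
group_a = ["MVHLTPEEKS", "MVHLTPEEKA", "MVHLSPEEKS"]
outsider = "MFVFLVLLPL"

matrix = dissimilarity_matrix(group_a)
rep = group_representative(group_a)
within = dissimilarity_vector(group_a, rep)
cross = distance(composition(outsider), rep)

print("pairwise matrix size:", len(matrix), "x", len(matrix[0]))
print("within-group vector:", [round(d, 3) for d in within])
print("outsider vs representative:", round(cross, 3))
```

In this toy setup a new sequence is compared once against the representative instead of against every member of the group, and a sequence from outside the group lies farther from the representative than any member does. That is the intuition behind a cross-group comparison.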
