Phylogenetic analysis of protein sequences based on a novel k-mer natural vector method.

Affiliations

School of Agriculture and Hydraulic Engineering, Suihua University, Suihua 152061, China.

School of Information Engineering, Suihua University, Suihua 152061, China.

Publication information

Genomics. 2019 Dec;111(6):1298-1305. doi: 10.1016/j.ygeno.2018.08.010. Epub 2018 Sep 5.

DOI:10.1016/j.ygeno.2018.08.010
PMID:30195069
Abstract

Based on the k-mer model for protein sequences, a novel k-mer natural vector method is proposed to characterize the k-mers in a protein sequence, taking into account both their counts and their distributions. The correspondence between a protein sequence and its k-mer natural vector is proved to be one-to-one. Phylogenetic analysis of protein sequences can therefore be performed easily, without evolutionary models or human intervention. In addition, there is no established criterion for choosing a suitable k, and k strongly influences both the results and the computational complexity. In this paper, a compound k-mer natural vector is used to quantify each protein sequence. Phylogenetic analyses of three protein datasets demonstrate that the new method describes the evolutionary relationships of proteins accurately and greatly improves computational efficiency.
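
The abstract gives no formulas, so the following is only a minimal sketch of the general idea it describes: summarizing each k-mer by how often it occurs and how its occurrences are distributed along the sequence. It assumes the usual natural-vector components (count, mean position, and a normalized second central moment of the positions). The normalization by `n * total`, the Euclidean distance, and the names `kmer_natural_vector` and `distance` are illustrative assumptions, not the authors' definitions.

```python
from collections import defaultdict
from math import sqrt


def kmer_natural_vector(seq, k):
    """Map each k-mer in seq to (count, mean position, second central moment)."""
    positions = defaultdict(list)
    # Record the 1-based start position of every overlapping k-mer.
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i + 1)

    total = len(seq) - k + 1  # number of k-mer windows in the sequence
    features = {}
    for kmer, pos in positions.items():
        n = len(pos)
        mu = sum(pos) / n
        # Assumed normalization: divide the squared deviations by n * total.
        d2 = sum((p - mu) ** 2 for p in pos) / (n * total)
        features[kmer] = (n, mu, d2)
    return features


def distance(seq_a, seq_b, k):
    """Euclidean distance between two sequences' k-mer feature vectors."""
    fa = kmer_natural_vector(seq_a, k)
    fb = kmer_natural_vector(seq_b, k)
    zero = (0, 0.0, 0.0)  # a k-mer absent from a sequence contributes zeros
    squared = 0.0
    for kmer in fa.keys() | fb.keys():
        a = fa.get(kmer, zero)
        b = fb.get(kmer, zero)
        squared += sum((x - y) ** 2 for x, y in zip(a, b))
    return sqrt(squared)
```

Pairwise distances computed this way could be passed to any distance-based tree-building method. One plausible reading of the "compound" vector is a concatenation of these features over several values of k, but the abstract does not specify this, so it is left out of the sketch.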

Similar articles

1
Phylogenetic analysis of protein sequences based on a novel k-mer natural vector method.
Genomics. 2019 Dec;111(6):1298-1305. doi: 10.1016/j.ygeno.2018.08.010. Epub 2018 Sep 5.
2
K-mer natural vector and its application to the phylogenetic analysis of genetic sequences.
Gene. 2014 Aug 1;546(1):25-34. doi: 10.1016/j.gene.2014.05.043. Epub 2014 May 22.
3
k-mer sparse matrix model for genetic sequence and its applications in sequence comparison.
J Theor Biol. 2014 Dec 21;363:145-50. doi: 10.1016/j.jtbi.2014.08.028. Epub 2014 Aug 23.
4
kmer2vec: A Novel Method for Comparing DNA Sequences by word2vec Embedding.
J Comput Biol. 2022 Sep;29(9):1001-1021. doi: 10.1089/cmb.2021.0536. Epub 2022 May 20.
5
An alignment-free measure based on physicochemical properties of amino acids for protein sequence comparison.
Comput Biol Chem. 2019 Jun;80:10-15. doi: 10.1016/j.compbiolchem.2019.01.005. Epub 2019 Jan 18.
6
KINN: An alignment-free accurate phylogeny reconstruction method based on inner distance distributions of k-mer pairs in biological sequences.
Mol Phylogenet Evol. 2023 Feb;179:107662. doi: 10.1016/j.ympev.2022.107662. Epub 2022 Nov 11.
7
Optimizing Spaced k-mer Neighbors for Efficient Filtration in Protein Similarity Search.
IEEE/ACM Trans Comput Biol Bioinform. 2014 Mar-Apr;11(2):398-406. doi: 10.1109/TCBB.2014.2306831.
8
Exploring the dynamic variations of viral genomes via a novel genetic network.
Mol Phylogenet Evol. 2022 Oct;175:107583. doi: 10.1016/j.ympev.2022.107583. Epub 2022 Jul 8.
9
An Information-Entropy Position-Weighted K-Mer Relative Measure for Whole Genome Phylogeny Reconstruction.
Front Genet. 2021 Oct 22;12:766496. doi: 10.3389/fgene.2021.766496. eCollection 2021.
10
Statistically Consistent k-mer Methods for Phylogenetic Tree Reconstruction.
J Comput Biol. 2017 Feb;24(2):153-171. doi: 10.1089/cmb.2015.0216. Epub 2016 Jul 7.

Cited by

1
Energy entropy vector: a novel approach for efficient microbial genomic sequence analysis and classification.
Brief Bioinform. 2025 Sep 6;26(5). doi: 10.1093/bib/bbaf459.
2
A survey of k-mer methods and applications in bioinformatics.
Comput Struct Biotechnol J. 2024 May 21;23:2289-2303. doi: 10.1016/j.csbj.2024.05.025. eCollection 2024 Dec.
3
Classification of Protein Sequences by a Novel Alignment-Free Method on Bacterial and Virus Families.
Genes (Basel). 2022 Sep 27;13(10):1744. doi: 10.3390/genes13101744.
4
Bioinformatics approaches for classification and investigation of the evolution of the Na/K-ATPase alpha-subunit.
BMC Ecol Evol. 2022 Oct 26;22(1):122. doi: 10.1186/s12862-022-02071-0.
5
Organizing the bacterial annotation space with amino acid sequence embeddings.
BMC Bioinformatics. 2022 Sep 23;23(1):385. doi: 10.1186/s12859-022-04930-5.
6
Exploring the dynamic variations of viral genomes via a novel genetic network.
Mol Phylogenet Evol. 2022 Oct;175:107583. doi: 10.1016/j.ympev.2022.107583. Epub 2022 Jul 8.
7
An accurate alignment-free protein sequence comparator based on physicochemical properties of amino acids.
Sci Rep. 2022 Jul 1;12(1):11158. doi: 10.1038/s41598-022-15266-8.
8
FEGS: a novel feature extraction model for protein sequences and its applications.
BMC Bioinformatics. 2021 Jun 3;22(1):297. doi: 10.1186/s12859-021-04223-3.
9
Residue Cluster Classes: A Unified Protein Representation for Efficient Structural and Functional Classification.
Entropy (Basel). 2020 Apr 20;22(4):472. doi: 10.3390/e22040472.
10
SWSPM: A Novel Alignment-Free DNA Comparison Method Based on Signal Processing Approaches.
Evol Bioinform Online. 2019 May 30;15:1176934319849071. doi: 10.1177/1176934319849071. eCollection 2019.