Segmented K-mer and its application on similarity analysis of mitochondrial genome sequences.

Affiliation

Department of Mathematics, School of Science, Anhui Science and Technology University, Fengyang, Anhui 233100, China.

Publication information

Gene. 2013 Apr 15;518(2):419-24. doi: 10.1016/j.gene.2012.12.079. Epub 2013 Jan 23.

DOI:10.1016/j.gene.2012.12.079

Abstract

K-mer-based approaches have been widely used in similarity analyses to discover similarity/dissimilarity among different biological sequences. In this study, we improve the traditional K-mer method and introduce a segmented K-mer approach (s-K-mer). After each primary sequence is divided into several segments, we simultaneously transform all these segments into corresponding K-mer-based vectors. In this approach, it is vital to determine the optimal combination of the value of K, the number of segments, and the distance metric, i.e., (K*, s*, d*). Based on the cascaded feature vectors transformed from the s* segmented sequences, we analyze 34 mammalian genome sequences using the proposed s-K-mer approach. Meanwhile, we compare the results of s-K-mer with those of the traditional K-mer method. The comparative results demonstrate that the s-K-mer approach outperforms the traditional K-mer method in similarity analysis among different species.
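The segment-then-concatenate idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how segments are cut, how the K-mer vectors are normalized, or which distance metrics were tested, so the near-equal-length split, the relative-frequency normalization, the Euclidean distance, and all function names below are assumptions made for illustration, not the authors' implementation.

```python
from itertools import product
import math

ALPHABET = "ACGT"

def kmer_vector(seq, k):
    """Relative frequencies of all 4**k k-mers in seq, in lexicographic order."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing ambiguous bases such as N
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]

def split_segments(seq, s):
    """Split seq into s contiguous segments of near-equal length (assumed scheme)."""
    n = len(seq)
    bounds = [j * n // s for j in range(s + 1)]
    return [seq[bounds[j]:bounds[j + 1]] for j in range(s)]

def s_kmer_vector(seq, k, s):
    """Concatenate ("cascade") the k-mer vectors of the s segments of seq."""
    vec = []
    for segment in split_segments(seq.upper(), s):
        vec.extend(kmer_vector(segment, k))
    return vec

def euclidean(u, v):
    """Euclidean distance, used here as one possible choice of metric d."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

if __name__ == "__main__":
    a, b = "AAAACCCC", "CCCCAAAA"
    # With s = 1 (traditional K-mer) the two toy sequences look identical.
    print(euclidean(s_kmer_vector(a, 1, 1), s_kmer_vector(b, 1, 1)))
    # With s = 2 the segment order is captured and the distance is non-zero.
    print(euclidean(s_kmer_vector(a, 1, 2), s_kmer_vector(b, 1, 2)))
```

In this toy case the unsegmented vectors of the two sequences coincide, while the two-segment vectors differ, which shows the kind of positional information that segmentation adds. The resulting vector has length s × 4^K, so K and s together control how fine-grained the comparison is.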

Similar articles

Segmented K-mer and its application on similarity analysis of mitochondrial genome sequences.

Gene. 2013 Apr 15;518(2):419-24. doi: 10.1016/j.gene.2012.12.079. Epub 2013 Jan 23.

Analysis of common k-mers for whole genome sequences using SSB-tree.

Genome Inform. 2002;13:30-41.

Alignment-free approaches for predicting novel Nuclear Mitochondrial Segments (NUMTs) in the human genome.

Gene. 2019 Apr 5;691:141-152. doi: 10.1016/j.gene.2018.12.040. Epub 2019 Jan 8.

K-mer natural vector and its application to the phylogenetic analysis of genetic sequences.

Gene. 2014 Aug 1;546(1):25-34. doi: 10.1016/j.gene.2014.05.043. Epub 2014 May 22.

A simple k-word interval method for phylogenetic analysis of DNA sequences.

J Theor Biol. 2013 Jan 21;317:192-9. doi: 10.1016/j.jtbi.2012.10.010. Epub 2012 Oct 18.

Statistically Consistent k-mer Methods for Phylogenetic Tree Reconstruction.

J Comput Biol. 2017 Feb;24(2):153-171. doi: 10.1089/cmb.2015.0216. Epub 2016 Jul 7.

Optimal choice of k-mer in composition vector method for genome sequence comparison.

Genomics. 2018 Sep;110(5):263-273. doi: 10.1016/j.ygeno.2017.11.003. Epub 2017 Nov 24.

Genome classification improvements based on k-mer intervals in sequences.

Genomics. 2019 Dec;111(6):1574-1582. doi: 10.1016/j.ygeno.2018.11.001. Epub 2018 Nov 13.

Determination of k-mer density in a DNA sequence and subsequent cluster formation algorithm based on the application of electronic filter.

Sci Rep. 2021 Jul 1;11(1):13701. doi: 10.1038/s41598-021-93154-3.

An ensemble distance measure of k-mer and Natural Vector for the phylogenetic analysis of multiple-segmented viruses.

J Theor Biol. 2016 Jun 7;398:136-44. doi: 10.1016/j.jtbi.2016.03.004. Epub 2016 Mar 10.

Cited by

Exploring objective feature sets in constructing the evolution relationship of animal genome sequences.

BMC Genomics. 2023 Oct 24;24(1):634. doi: 10.1186/s12864-023-09747-x.

The Machine-Learning-Mediated Interface of Microbiome and Genetic Risk Stratification in Neuroblastoma Reveals Molecular Pathways Related to Patient Survival.

Cancers (Basel). 2022 Jun 10;14(12):2874. doi: 10.3390/cancers14122874.

Evolutionary mechanism and biological functions of 8-mers containing CG dinucleotide in yeast.

Chromosome Res. 2017 Jun;25(2):173-189. doi: 10.1007/s10577-017-9554-z. Epub 2017 Feb 9.

K-mer natural vector and its application to the phylogenetic analysis of genetic sequences.

Gene. 2014 Aug 1;546(1):25-34. doi: 10.1016/j.gene.2014.05.043. Epub 2014 May 22.

Evaluation of whole genome sequencing for outbreak detection of Salmonella enterica.

PLoS One. 2014 Feb 4;9(2):e87991. doi: 10.1371/journal.pone.0087991. eCollection 2014.
