
An Information-Entropy Position-Weighted K-Mer Relative Measure for Whole Genome Phylogeny Reconstruction.

Authors

Wu Yao-Qun, Yu Zu-Guo, Tang Run-Bin, Han Guo-Sheng, Anh Vo V

Affiliations

Hunan Key Laboratory for Computation and Simulation in Science and Engineering and Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Hunan, China.

Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang, China.

Publication

Front Genet. 2021 Oct 22;12:766496. doi: 10.3389/fgene.2021.766496. eCollection 2021.

DOI:10.3389/fgene.2021.766496
PMID:34745231
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8568955/
Abstract

Alignment-based methods are at a disadvantage in sequence comparison and phylogeny reconstruction because of their high computational cost in both time and space. Alignment-free methods, by contrast, incur low computational costs and have recently gained popularity in bioinformatics. Here we propose a new alignment-free method for phylogenetic tree reconstruction based on whole genome sequences. A key component is a measure called the information-entropy position-weighted k-mer relative measure (IEPWRMkmer), which combines the position-weighted measure of k-mers proposed by our group with the information entropy of the frequency of k-mers. The Manhattan distance is used to calculate the pairwise distance between species. Finally, we use the Neighbor-Joining method to construct the phylogenetic tree. To evaluate the performance of this method, we perform phylogenetic analysis on two datasets used by other researchers. The results demonstrate that the method is efficient and reliable. The source code of our method is available at https://github.com/wuyaoqun37/IEPWRMkmer.
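
The abstract names three generic building blocks: k-mer frequencies, their information entropy, and the Manhattan distance between species. The Python sketch below illustrates only those generic pieces. It is not the IEPWRMkmer measure itself, whose position weighting and exact formula are not given in the abstract. The function names (`kmer_frequencies`, `shannon_entropy`, `manhattan`, `distance_matrix`) and the toy sequences are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import product
from math import log2


def kmer_frequencies(seq, k):
    """Relative frequency of every DNA k-mer, in a fixed lexicographic order.

    Assumption: plain frequencies, with no position weighting as in the paper.
    k-mers containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def shannon_entropy(freqs):
    """Shannon entropy (in bits) of a frequency vector; zero entries are skipped."""
    return -sum(p * log2(p) for p in freqs if p > 0)


def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def distance_matrix(seqs, k):
    """Pairwise Manhattan distances between the k-mer frequency vectors of named sequences."""
    names = list(seqs)
    vecs = {n: kmer_frequencies(seqs[n], k) for n in names}
    return names, [[manhattan(vecs[a], vecs[b]) for b in names] for a in names]


if __name__ == "__main__":
    # Toy sequences, not real genomes.
    toy = {"sp1": "ACGTACGTAC", "sp2": "ACGTTTGTAC", "sp3": "GGGGCCCCAA"}
    names, dist = distance_matrix(toy, k=2)
    for name, row in zip(names, dist):
        print(name, [round(d, 3) for d in row])
    print("entropy of sp1 2-mer frequencies:",
          round(shannon_entropy(kmer_frequencies(toy["sp1"], 2)), 3))
```

A symmetric distance matrix of this kind is the input that a Neighbor-Joining implementation would take to build the tree. That step is left out here to keep the sketch short.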


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f393/8568955/650de0b49c42/fgene-12-766496-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f393/8568955/15c302805d9a/fgene-12-766496-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f393/8568955/c2fb26ce8dff/fgene-12-766496-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f393/8568955/2b4812fa2cee/fgene-12-766496-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f393/8568955/80d166ec23b6/fgene-12-766496-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f393/8568955/cb7a4b19cabb/fgene-12-766496-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f393/8568955/efdd9c871c0a/fgene-12-766496-g007.jpg


