Wu Yao-Qun, Yu Zu-Guo, Tang Run-Bin, Han Guo-Sheng, Anh Vo V
Hunan Key Laboratory for Computation and Simulation in Science and Engineering and Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Hunan, China.
Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang, China.
Front Genet. 2021 Oct 22;12:766496. doi: 10.3389/fgene.2021.766496. eCollection 2021.
Alignment-based methods are at a disadvantage in sequence comparison and phylogeny reconstruction because of their high computational cost in both time and space. Alignment-free methods, by contrast, incur low computational costs and have recently gained popularity in bioinformatics. Here we propose a new alignment-free method for phylogenetic tree reconstruction based on whole genome sequences. A key component is a measure called IEPWRMkmer, which combines the position-weighted measure of k-mers proposed by our group with the information entropy of k-mer frequencies. The Manhattan distance is used to calculate the pairwise distance between species. Finally, we use the Neighbor-Joining method to construct the phylogenetic tree. To evaluate the performance of this method, we perform phylogenetic analysis on two datasets used by other researchers. The results demonstrate that the method is efficient and reliable. The source code of our method is available at https://github.com/wuyaoqun37/IEPWRMkmer.
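The general shape of such an alignment-free pipeline can be illustrated with a minimal Python sketch. It is not the authors' IEPWRMkmer implementation: the abstract does not give the position-weighted formula, so the sketch uses plain relative k-mer frequencies as a stand-in feature vector, shows how a Shannon entropy of k-mer frequencies can be computed, and builds a Manhattan distance matrix. All function names (`kmer_frequencies`, `shannon_entropy`, `manhattan`, `distance_matrix`) and the restriction to the alphabet ACGT are assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import log2

ALPHABET = "ACGT"  # assumption: DNA sequences; other symbols are ignored


def kmer_frequencies(seq, k):
    """Relative frequency of every k-mer over ACGT, in a fixed order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Only windows made entirely of A, C, G, T contribute to the total.
    total = sum(counts[w] for w in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[w] / total for w in kmers]


def shannon_entropy(freqs):
    """Shannon entropy (in bits) of a frequency vector."""
    return -sum(p * log2(p) for p in freqs if p > 0)


def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def distance_matrix(seqs, k):
    """Pairwise Manhattan distances between k-mer frequency vectors.

    `seqs` maps a species name to its sequence. Returns the names and
    a square matrix in the same order.
    """
    names = list(seqs)
    vecs = [kmer_frequencies(seqs[n], k) for n in names]
    return names, [[manhattan(a, b) for b in vecs] for a in vecs]
```

The resulting matrix is the kind of input a Neighbor-Joining implementation expects. Substituting the published IEPWRMkmer measure for the stand-in feature vector would require the definitions given in the full paper or in the linked repository.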