School of Computer and Information Engineering, Jiangxi Normal University, Nanchang, 330022, China.
School of Life Science, Jiangxi Normal University, Nanchang, 330022, China.
BMC Genomics. 2024 Nov 18;25(1):1097. doi: 10.1186/s12864-024-11006-6.
Quantifying the features of mitochondrial genome structural variation is crucial for understanding its contribution to complexity. Accurate quantification and interpretation of organizational diversity can help uncover the laws and patterns of biological evolution. The current qMGR approach accumulates the changes in the two genes adjacent to each gene to calculate the rearrangement frequency (RF) of each single gene and the rearrangement score (RS) for specific taxa across the mitogenomes of a given taxonomic group. However, this approach may introduce bias, because it assigns scores to adjacent genes rather than to the rearranged genes themselves. To overcome this limitation, we propose a novel statistical method, qGO, to quantify the diversity of gene organization. The qGO method, which is based on the homology of gene order, represents genome organizational diversity more accurately by partitioning gene strings and individually assigning weights to genes that span different regions. In addition, a comprehensive approach to distance computation is employed, generating an extensive matrix of rearrangement distances. Through experiments on more than 5500 vertebrate mitochondrial genomes, we demonstrate that qGO outperforms existing methods in accuracy and interpretability. The method improves the comparability of genomes and allows a more accurate comparison of the diversity of mitochondrial genome organization across taxa. These findings have significant implications for unraveling genome evolution, exploring genome function, and investigating the process of molecular evolution.
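The bias attributed to adjacency-based scoring can be illustrated with a small sketch. This is not the qMGR or qGO implementation, whose formulas are not given in the abstract. It is a simplified stand-in that assumes a circular gene order, ignores strand, assumes each gene occurs once, and gives a gene one point for each of its two neighbours that differs from a reference order. The names `neighbours`, `adjacency_scores`, and `accumulate` are hypothetical.

```python
from collections import defaultdict


def neighbours(order):
    """Map each gene to its (left, right) neighbours in a circular gene order."""
    n = len(order)
    return {g: (order[i - 1], order[(i + 1) % n]) for i, g in enumerate(order)}


def adjacency_scores(reference, genome):
    """Score each reference gene 0, 1 or 2: one point per changed neighbour."""
    ref = neighbours(reference)
    cur = neighbours(genome)
    scores = {}
    for gene in reference:
        if gene not in cur:
            # Assumption: a missing gene counts as both adjacencies changed.
            scores[gene] = 2
            continue
        scores[gene] = sum(1 for a, b in zip(ref[gene], cur[gene]) if a != b)
    return scores


def accumulate(reference, genomes):
    """Sum the per-gene scores over a collection of genomes (no normalization)."""
    totals = defaultdict(int)
    for genome in genomes:
        for gene, score in adjacency_scores(reference, genome).items():
            totals[gene] += score
    return dict(totals)


reference = ["A", "B", "C", "D", "E"]
swapped = ["A", "C", "B", "D", "E"]  # only B and C have moved
print(adjacency_scores(reference, swapped))
# {'A': 1, 'B': 2, 'C': 2, 'D': 1, 'E': 0}
```

In this toy case only B and C were rearranged, yet A and D also receive nonzero scores because one of their neighbours changed. That is the kind of effect the abstract describes as assigning scores to adjacent genes rather than to rearranged genes.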