Efficient genome monomer higher-order structure annotation and identification using the GRMhor algorithm.

Authors

Glunčić Matko, Barić Domjan, Paar Vladimir

Affiliations

Faculty of Science, University of Zagreb, Zagreb 10000, Croatia.

Department of Mathematical, Physical and Chemical Sciences, Croatian Academy of Sciences and Arts, Zagreb 10000, Croatia.

Publication information

Bioinform Adv. 2024 Nov 28;4(1):vbae191. doi: 10.1093/bioadv/vbae191. eCollection 2024.

Abstract

MOTIVATION

Tandem monomeric units, integral components of eukaryotic genomes, form higher-order repeat (HOR) structures that play crucial roles in maintaining chromosome integrity and regulating gene expression and protein abundance. Given their significant influence on processes such as evolution, chromosome segregation, and disease, developing a sensitive and automated tool for identifying HORs across diverse genomic sequences is essential.

RESULTS

In this study, we applied the GRMhor (Global Repeat Map hor) algorithm to analyse the centromeric region of chromosome 20 in three individual human genomes, as well as the centromeric regions of three higher primates. In all three human genomes, we identified six distinct HOR arrays, which revealed significantly greater differences in the number of canonical and variant copies, as well as in their overall structure, than would be expected given the 99.9% genetic similarity among humans. Furthermore, our analysis of higher primate genomes, which revealed entirely different HOR sequences, indicates a much larger genomic divergence between humans and higher primates than previously recognized. These results underscore the suitability of the GRMhor algorithm for studying specificities in individual genomes, particularly those involving repetitive monomers in centromere structure, which is essential for proper chromosome segregation during cell division, while also highlighting its utility in exploring centromere evolution and other repetitive genomic regions.

AVAILABILITY AND IMPLEMENTATION

Source code and example binaries are freely available for download at github.com/gluncic/GRM2023.
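
ILLUSTRATIVE SKETCH

The abstract refers to HORs and to canonical and variant HOR copies without showing what these look like. The toy Python sketch below may help make the idea concrete. It is not the GRMhor algorithm and does not reproduce any part of the published code. It assumes the monomers have already been classified into labelled types (the single letters below are hypothetical), and the function names `find_hor_period` and `count_copies` and the `min_match` threshold are invented for illustration only.

```python
from collections import Counter


def find_hor_period(labels, min_match=0.8):
    """Return the smallest period n >= 2 at which the monomer labels repeat.

    A period n qualifies when at least `min_match` of the positions i
    satisfy labels[i] == labels[i + n]. Returns None if no period qualifies.
    """
    for n in range(2, len(labels) // 2 + 1):
        pairs = len(labels) - n
        matches = sum(labels[i] == labels[i + n] for i in range(pairs))
        if matches / pairs >= min_match:
            return n
    return None


def count_copies(labels, n):
    """Split the array into consecutive n-monomer copies.

    The most frequent copy is treated as the canonical one (a toy
    assumption); every other copy is counted as a variant.
    """
    copies = [tuple(labels[i:i + n]) for i in range(0, len(labels) - n + 1, n)]
    counts = Counter(copies)
    canonical, n_canonical = counts.most_common(1)[0]
    return canonical, n_canonical, len(copies) - n_canonical


# Hypothetical array: a 3-monomer HOR with one variant copy (ABX).
labels = list("ABCABCABXABCABC")
n = find_hor_period(labels)
canonical, n_canonical, n_variant = count_copies(labels, n)
print(n, "".join(canonical), n_canonical, n_variant)  # prints: 3 ABC 4 1
```

In this toy array the labels repeat with period 3, so the array is read as a 3-monomer HOR with four canonical copies and one variant copy. Real centromeric arrays require sequence-level comparison of monomers, which this sketch deliberately skips.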

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/931e/11630843/261805085887/vbae191f1.jpg
