Instituto de Investigação e Inovação em Saúde (i3S), University of Porto, Porto, Portugal.
Institute of Molecular Pathology and Immunology of the University of Porto (IPATIMUP), Porto, Portugal.
Sci Rep. 2023 Jun 24;13(1):10251. doi: 10.1038/s41598-023-32137-y.
Microsatellites, or Short Tandem Repeats (STRs), are subject to frequent length mutations that involve the loss or gain of an integer number of repeats. This work aimed to investigate the correlation between the specific repetitive motif composition of STRs and their mutational dynamics, specifically the occurrence of single- or multistep mutations. Allelic transmission data, comprising 323,818 allele transfers and 1,297 mutations, were gathered for 35 Y-chromosomal STRs with simple structure. Six structure groups were established according to the repetitive motif present in the DNA leading strand of the markers: ATT, CTT, TCTA/GATA, GAAA/CTTT, CTTTT, and AGAGAT. Results show that the occurrence of multistep mutations varies significantly among groups of markers defined by the repetitive motif. The highest frequency of multistep mutations was observed in the group with repetitive motif CTTTT (25% of the detected mutations) and the lowest in the group with repetitive motifs TCTA/GATA (0.93%). Statistically significant differences (α = 0.05) were found between groups whose repetitive motifs differ in length, namely between TCTA/GATA and each of ATT (p = 0.0168), CTT (p < 0.0001), and CTTTT (p < 0.0001), as well as between GAAA/CTTT and CTTTT (p = 0.0102). A significant difference was also found between the two tetrameric groups, GAAA/CTTT and TCTA/GATA (p < 0.0001), with the former showing 5.7 times more multistep mutations than the latter. When considering the number of repeats of the mutated paternal alleles, statistically significant differences were found between the GATA and ATT structure groups for alleles with 10 or 12 repeats. These results, which demonstrate the heterogeneity of mutational dynamics across repeat motifs, have implications for population genetics, epidemiology, and phylogeography, and more generally wherever STR mutation models are used in evolutionary studies.
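As a rough illustration of the scale of the data set, the two totals reported above (323,818 allele transfers and 1,297 mutations) imply a pooled mutation rate across the 35 markers of about 4 in 1,000 transfers. The sketch below computes that pooled rate with a 95% Wilson score interval. The choice of interval, the function name, and the variable names are assumptions made for illustration and are not taken from the study. Per-group rates are not computed because the abstract does not give per-group counts.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / trials
    denom = 1 + z ** 2 / trials
    centre = (p + z ** 2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)) / denom
    return centre - half, centre + half


# Totals reported in the abstract, pooled over all 35 Y-STRs.
ALLELE_TRANSFERS = 323_818
MUTATIONS = 1_297

# Pooled rate: mutations per allele transfer, ignoring differences among markers.
rate = MUTATIONS / ALLELE_TRANSFERS
low, high = wilson_interval(MUTATIONS, ALLELE_TRANSFERS)

print(f"pooled mutation rate: {rate:.5f} per transfer")
print(f"95% Wilson interval:  {low:.5f} to {high:.5f}")
```

Because this figure pools markers whose mutational behaviour differs by motif, it should be read only as an overall average, not as a rate applicable to any single structure group.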