Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany.
Genes (Basel). 2021 Oct 1;12(10):1571. doi: 10.3390/genes12101571.
Several strongly conserved DNA sequence patterns in and between introns and intergenic regions (IIRs), consisting of short tandem repeats (STRs) with repeat lengths <3 bp, have already been described in the kingdom of Animalia. In this work, we expanded the search for and analysis of conserved DNA sequence patterns to a wider range of genomes. Our aims were to confirm the conservation of these patterns, to support the hypothesis that they are subject to functional constraints, and/or to identify unknown patterns. We performed pairwise comparisons of the genomic DNA sequences of genes, exons, CDS, introns and intergenic regions of 34 Embryophyta (land plants), 30 Protista and 29 Fungi using established k-mer-based (alignment-free) comparison methods. Additionally, the results were compared with values derived for Animalia in former studies. We confirmed strong correlations between the sequence structures of IIRs spanning the entire domain of Eukaryota. We found that the high correlations within introns, within intergenic regions and between the two result from conserved abundances of STRs with repeat units ≤2 bp (e.g., (AT)n). For some sequence patterns and their inverse complementary sequences, we found a violation of equal distribution on complementary DNA strands in a subset of genomes. Looking at mismatches within the identified STR patterns, we found specific preferences for certain nucleotides that are stable across all four phylogenetic kingdoms. We conclude that all of these conserved patterns between IIRs indicate a shared function of these sequence structures related to STRs.
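To make the idea of a k-mer-based, alignment-free comparison more concrete, the following is a minimal Python sketch and not the authors' pipeline. It assumes that two sequences are compared by the Pearson correlation of their relative k-mer frequencies; the abstract does not state which correlation measure or word lengths were used. All function names (`kmer_frequencies`, `spectrum_correlation`, `strand_asymmetry`) are hypothetical and chosen for illustration only. The last function shows one simple way to check whether a pattern and its inverse complement occur equally often on the same strand.

```python
from itertools import product
from math import sqrt


def kmer_frequencies(seq, k):
    """Relative frequency of every k-mer over A/C/G/T, using overlapping windows."""
    seq = seq.upper()
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # windows containing N or other symbols are skipped
            counts[word] += 1
            total += 1
    return {w: (c / total if total else 0.0) for w, c in counts.items()}


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a flat spectrum
    return cov / sqrt(var_x * var_y)


def spectrum_correlation(seq_a, seq_b, k):
    """Assumed similarity measure: correlation of the two k-mer frequency spectra."""
    freq_a = kmer_frequencies(seq_a, k)
    freq_b = kmer_frequencies(seq_b, k)
    words = sorted(freq_a)  # same word order for both spectra
    return pearson([freq_a[w] for w in words], [freq_b[w] for w in words])


def reverse_complement(word):
    """Inverse complementary sequence of a DNA word."""
    return word.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_overlapping(seq, word):
    """Number of (possibly overlapping) occurrences of word in seq."""
    seq, word = seq.upper(), word.upper()
    return sum(1 for i in range(len(seq) - len(word) + 1) if seq[i:i + len(word)] == word)


def strand_asymmetry(seq, word):
    """Count of word minus count of its inverse complement on the same strand."""
    return count_overlapping(seq, word) - count_overlapping(seq, reverse_complement(word))
```

In such a sketch, `spectrum_correlation(intron_seq, intergenic_seq, k)` would be evaluated for every pair of regions or genomes, and a value of `strand_asymmetry` far from zero for a given word would point to an unequal distribution of that pattern on the two complementary strands.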