Jighly Abdulqader, Hamwieh Aladdin, Ogbonnaya Francis C
International Center for Agricultural Research in the Dry Areas (ICARDA), P.O. Box 5466, Aleppo, Syria.
BMC Res Notes. 2011 Jul 20;4:239. doi: 10.1186/1756-0500-4-239.
Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences whose repeat unit is no longer than six bases, and they are distributed throughout the genome. SSRs are widely used as molecular markers because they are easy to detect, with applications including genetic diversity studies, genome mapping, and marker-assisted selection. They are also highly mutable because DNA polymerase slips during DNA replication. This slippage raises the frequency of insertion/deletion (INDEL) mutations in SSRs well above that seen in other types of molecular markers, such as single nucleotide polymorphisms (SNPs). In the genome as a whole, however, SNPs are more frequent than INDELs. As a result, existing sequence alignment algorithms are designed to fit the vast majority of genomic sequence and do not treat microsatellite regions as unique sequences that require special consideration. Such algorithms are of limited use for SSR loci because they produce many overlaps between different repeat units, which lead to false evolutionary relationships.
To overcome the limitations of existing alignment algorithms when dealing with SSR loci, a new algorithm was developed as a PERL script with a Tk graphical interface. The program aligns sequences only after first determining the repeat units and the positions of the last SSR nucleotides. The alignment is then shifted according to the type of repeat unit that was inserted. When phylogenetic relationships were compared before and after applying the new algorithm, the trees differed in many places, and the differences grew as SSR length and complexity increased. However, the distances between different lineages were smaller after applying the new algorithm.
The new algorithm produces better alignments of SSR loci because it reflects more reliable evolutionary relationships between different lineages. It reduces overlapping during SSR alignment, which results in more realistic phylogenetic relationships.
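The idea of locating the repeat unit before aligning can be illustrated with a minimal sketch. It is written in Python for brevity and is not the authors' PERL program. The function names `find_ssrs` and `pad_ssr_tracts`, the `min_repeats` threshold, and the choice to pad the shorter tract with gaps at its end are all assumptions made for illustration. The sketch also assumes that each sequence carries one dominant SSR tract and that the flanking regions are already the same length.

```python
import re


def find_ssrs(seq, min_repeats=3):
    """Return (motif, start, end, copies) for each tandem repeat in seq.

    Hypothetical helper: a motif is 1-6 bases repeated at least
    min_repeats times; start and end are 0-based, inclusive positions.
    """
    # The lazy {1,6}? quantifier prefers the shortest repeat unit.
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        copies = len(m.group(0)) // len(motif)
        hits.append((motif, m.start(), m.end() - 1, copies))
    return hits


def pad_ssr_tracts(seqs, min_repeats=3):
    """Gap-pad the longest SSR tract of each sequence to a common length.

    Assumption: gaps go at the end of the tract, so missing repeat
    units line up as whole blocks and the flanks stay in register.
    """
    parts = []
    for s in seqs:
        hits = find_ssrs(s, min_repeats)
        if not hits:
            raise ValueError("no SSR tract found in %r" % s)
        # Keep only the longest tract found in this sequence.
        _, start, end, _ = max(hits, key=lambda h: h[2] - h[1])
        parts.append((s[:start], s[start:end + 1], s[end + 1:]))
    longest = max(len(tract) for _, tract, _ in parts)
    return [left + tract + "-" * (longest - len(tract)) + right
            for left, tract, right in parts]


if __name__ == "__main__":
    for row in pad_ssr_tracts(["GGCACACATT", "GGCACACACATT"]):
        print(row)
```

Here the missing `CA` unit in the first sequence appears as a single two-base gap at the end of the repeat tract, rather than being spread across mismatched positions, which is the kind of overlap between repeat units that the abstract describes as misleading.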