Center for Human Identification, Health Science Center, University of North Texas, Fort Worth, TX, USA.
Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, USA.
BMC Bioinformatics. 2022 Nov 19;23(1):497. doi: 10.1186/s12859-022-05021-1.
Tandem repeats (TRs) are highly variable genomic variants that are widely used in individual identification, disease diagnostics, and evolutionary studies. Recent advances in sequencing technologies and bioinformatic tools facilitate genome-wide calling of TR haplotypes. Both length-based and sequence-based TR alleles are used in different applications, but sequence-based alleles can characterize TR haplotypes with the highest precision. An easy-to-use bioinformatic tool is therefore needed to identify differences between TR haplotypes at the single-nucleotide level.
In this study, we developed the Universal STR Allele Toolkit (USAT) for TR haplotype analysis. USAT takes TR haplotype output from existing tools and performs allele size conversion, sequence comparison of haplotypes, figure plotting, comparison of allele distributions, and interactive visualization. An example application to the CODIS core STR loci used in forensic DNA analysis, with benchmark human individuals, demonstrated these capabilities. USAT has a user-friendly graphical interface and runs quickly on major operating systems, with parallel computing enabled.
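To make two of the operations named above concrete (allele size conversion and sequence comparison of haplotypes), here is a minimal Python sketch. It is not USAT's implementation: the function names `allele_designation` and `point_differences` are hypothetical, the input is assumed to be the repeat region only (no flanking sequence), and the comparison assumes the two haplotypes are already aligned and of equal length. The length-to-allele conversion follows the common forensic convention in which an allele such as 9.3 denotes nine full repeats plus three extra bases.

```python
def allele_designation(repeat_region_bp: int, motif_len: int) -> str:
    """Convert a repeat-region length (bp) to a length-based allele label.

    Assumes the length covers only the repeat region, with no flanking
    bases. A partial repeat is written after a dot, e.g. "9.3".
    """
    if repeat_region_bp < 0 or motif_len <= 0:
        raise ValueError("length must be >= 0 and motif length must be > 0")
    full, partial = divmod(repeat_region_bp, motif_len)
    return f"{full}.{partial}" if partial else str(full)


def point_differences(hap_a: str, hap_b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in A, base in B) for each mismatch.

    Assumes both haplotypes are already aligned and equal in length;
    haplotypes of different length would need an alignment step first.
    """
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must have equal length; align them first")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(hap_a, hap_b), start=1)
        if a != b
    ]


if __name__ == "__main__":
    # Toy sequence for illustration only: six 4-bp repeats, a 3-bp
    # partial repeat, then three more 4-bp repeats (39 bp in total).
    toy = "AATG" * 6 + "ATG" + "AATG" * 3
    print(allele_designation(len(toy), 4))            # 9.3
    print(point_differences("AATGAATG", "AATGAGTG"))  # [(6, 'A', 'G')]
```

The second function illustrates why sequence-based alleles are more precise than length-based ones: the two toy haplotypes have the same length, and so the same length-based allele, yet differ at one nucleotide.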
USAT is user-friendly bioinformatics software for the interpretation, visualization, and comparison of TRs.