Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia.
Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria, Australia.
Curr Protoc. 2024 Nov;4(11):e70010. doi: 10.1002/cpz1.70010.
Short tandem repeats (STRs) and variable-number tandem repeats (VNTRs) are repetitive sequences found throughout the genome. Expansions of these repeats are currently known to cause ∼60 diseases, and expansions at new loci linked to rare diseases continue to be discovered. Genome sequencing is an important tool for detecting disease-causing variants, and several computational tools have been developed to analyze tandem repeats from genomic data, enabling genotyping and the identification of expanded alleles. However, guidelines for conducting the analysis of these repeats and, more importantly, for assessing the findings are lacking. Understanding the tools and their technical limitations is important for accurately interpreting the results. This article provides detailed, step-by-step instructions for three key use cases in STR analysis from short-read genome sequencing data, which are also applicable to smaller VNTRs. First, we demonstrate an approach for genotyping known pathogenic loci and identifying clinically significant expansions. Second, we offer guidance on defining tandem repeat loci and conducting genome-wide genotyping studies, which is also applicable to diploid organisms other than humans. Third, we provide instructions on how to find novel expansions at loci not previously known to be associated with disease, aiding in the discovery of new pathogenic loci. Moreover, we introduce newly developed helper tools that enable a complete and streamlined tandem repeat analysis protocol by addressing the gaps in current methods. All three protocols are compatible with the human hg19, hg38, and latest telomere-to-telomere (hs1) reference genomes. Additionally, this article provides an overview and discussion of how to interpret genotyping results. © 2024 The Author(s). Current Protocols published by Wiley Periodicals LLC.
Basic Protocol 1: Genotyping known pathogenic tandem repeat loci
Alternate Protocol: Genotyping known pathogenic tandem repeat loci with STRipy
Support Protocol 1: Installation of tools and ExpansionHunter catalog modification
Basic Protocol 2: Performing genome-wide genotyping of tandem repeats
Basic Protocol 3: Discovering de novo tandem repeat expansions
Support Protocol 2: Compiling ExpansionHunter Denovo from source code and generating STR profiles