Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC, 3052, Australia.
Department of Medical Biology, The University of Melbourne, 1G Royal Parade, Parkville, VIC, 3052, Australia.
Sci Rep. 2022 Jul 30;12(1):13124. doi: 10.1038/s41598-022-17267-z.
Bioinformatic methods for detecting short tandem repeat (STR) expansions in short-read sequencing data have identified new repeat expansions in humans, but they require alignment information to identify repetitive motif enrichment at genomic locations. We present superSTR, an ultrafast method that does not require alignment. We use superSTR to process whole-genome and whole-exome sequencing data and to perform the first STR analysis of the UK Biobank, efficiently screening for and identifying known and potential disease-associated STRs in the exomes of 49,953 biobank participants. We also demonstrate the first bioinformatic screening of RNA sequencing data to detect repeat expansions in humans and in mouse models of ataxia and dystrophy.
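The abstract does not describe how superSTR works internally, so the following is only a generic illustration of what alignment-free repeat detection can look like, not the authors' implementation. It scans a single read for its longest periodic stretch, using no reference genome, and reports a strand- and rotation-independent motif together with the fraction of the read that the repeat covers. The function names (`canonical_motif`, `longest_periodic_run`, `scan_read`) and the parameters `max_period` and `min_copies` are hypothetical choices made for this sketch.

```python
from typing import Optional, Tuple

# Illustrative sketch only: NOT superSTR's algorithm. All names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif: str) -> str:
    """Return the lexicographically smallest rotation of the motif or of its
    reverse complement, so that CAG, AGC, GCA, CTG, ... share one label."""
    reverse_complement = motif.translate(COMPLEMENT)[::-1]
    rotations = [
        m[i:] + m[:i]
        for m in (motif, reverse_complement)
        for i in range(len(m))
    ]
    return min(rotations)


def longest_periodic_run(read: str, period: int) -> Tuple[int, int]:
    """Return (start, length) of the longest stretch in which every base
    equals the base `period` positions before it."""
    best_start, best_len = 0, 0
    run_start = 0
    for i in range(period, len(read)):
        if read[i] != read[i - period]:
            # A periodic stretch ending at i cannot contain position i - period.
            run_start = i - period + 1
        length = i - run_start + 1
        if length > best_len:
            best_start, best_len = run_start, length
    return best_start, best_len


def scan_read(
    read: str, max_period: int = 6, min_copies: int = 3
) -> Optional[Tuple[str, float]]:
    """Return (canonical motif, fraction of read covered) for the longest
    tandem repeat in the read, or None if no repeat has enough copies."""
    best = None
    for period in range(1, max_period + 1):
        start, length = longest_periodic_run(read, period)
        if length < min_copies * period:
            continue  # too few copies to count as a tandem repeat
        # Strict '>' keeps the shortest period when lengths tie (CAG over CAGCAG).
        if best is None or length > best[1]:
            best = (canonical_motif(read[start:start + period]), length)
    if best is None:
        return None
    motif, length = best
    return motif, length / len(read)


# A 38-base read carrying ten CAG copies; the repeat spans 30 of the 38 bases.
example = scan_read("ACGT" + "CAG" * 10 + "TTGA")
print(example)
```

In a screening setting, per-read results like these could be tallied per motif and per sample, and samples with unusually high counts for a motif flagged for follow-up. That aggregation step is an assumption about how such a screen might be organised, not a description of the published method.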