Department of Genetics & Genome Biology, University of Leicester, University Road, Leicester LE1 7RH, UK.
Forensic Sci Int Genet. 2018 Jul;35:97-106. doi: 10.1016/j.fsigen.2018.03.012. Epub 2018 Apr 12.
Short tandem repeats on the male-specific region of the Y chromosome (Y-STRs) are permanently linked as haplotypes, and therefore Y-STR sequence diversity can be considered within the robust framework of a phylogeny of haplogroups defined by single nucleotide polymorphisms (SNPs). Here we use massively parallel sequencing (MPS) to analyse the 23 Y-STRs in Promega's prototype PowerSeq™ Auto/Mito/Y System kit (containing the markers of the PowerPlex® Y23 [PPY23] System) in a set of 100 diverse Y chromosomes whose phylogenetic relationships are known from previous megabase-scale resequencing. Including allele duplications and alleles resulting from likely somatic mutation, we characterised 2311 alleles, demonstrating 99.83% concordance with capillary electrophoresis (CE) data on the same sample set. The set contains 267 distinct sequence-based alleles (an increase of 58% compared to the 169 detectable by CE), including 60 novel Y-STR variants phased with their flanking sequences which have not been reported previously to our knowledge. Variation includes 46 distinct alleles containing non-reference variants of SNPs/indels in both repeat and flanking regions, and 145 distinct alleles containing repeat pattern variants (RPV). For DYS385a,b, DYS481 and DYS390 we observed repeat count variation in short flanking segments previously considered invariable, and suggest new MPS-based structural designations based on these. We considered the observed variation in the context of the Y phylogeny: several specific haplogroup associations were observed for SNPs and indels, reflecting the low mutation rates of such variant types; however, RPVs showed less phylogenetic coherence and more recurrence, reflecting their relatively high mutation rates. 
In conclusion, our study reveals considerable additional diversity at the Y-STRs of the PPY23 set via MPS analysis, demonstrates high concordance with CE data, facilitates nomenclature standardisation, and places Y-STR sequence variants in their phylogenetic context.
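The headline figures above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the number of discordant alleles is not stated in the abstract, so it is merely back-calculated from the rounded 99.83% concordance figure.

```python
def percent_increase(new: int, old: int) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100


def implied_discordant(total: int, concordance_pct: float) -> int:
    """Number of discordant calls implied by a rounded concordance percentage.

    This is an inference from the rounded figure, not a value reported
    in the abstract.
    """
    return round(total * (1 - concordance_pct / 100))


# Figures taken from the abstract.
SEQUENCE_ALLELES = 267    # distinct alleles distinguishable by MPS
CE_ALLELES = 169          # distinct alleles distinguishable by CE
TOTAL_ALLELES = 2311      # alleles characterised across the 100 samples
CONCORDANCE_PCT = 99.83   # reported MPS vs CE concordance

increase = percent_increase(SEQUENCE_ALLELES, CE_ALLELES)
discordant = implied_discordant(TOTAL_ALLELES, CONCORDANCE_PCT)
recomputed = (TOTAL_ALLELES - discordant) / TOTAL_ALLELES * 100

print(f"Increase in distinct alleles: {increase:.1f}%")
print(f"Implied discordant alleles (assumption): {discordant}")
print(f"Concordance recomputed from that count: {recomputed:.2f}%")
```

Under these assumptions, 98 additional distinct alleles over a CE baseline of 169 corresponds to the reported increase of about 58%, and a concordance of 99.83% across 2311 alleles would be consistent with roughly four discordant calls.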