Science and Technology, Environment and Climate Change Canada, Ottawa, Ontario, Canada.
Biology Department, Trent University, Peterborough, Ontario, Canada.
Mol Ecol Resour. 2024 Apr;24(3):e13929. doi: 10.1111/1755-0998.13929. Epub 2024 Jan 30.
Accurate and efficient genotyping of microsatellite loci is essential in population genetics and is also used in various demographic analyses. Protocols for next-generation sequencing of microsatellite loci enable high-throughput and cross-compatible allele scoring, addressing two common limitations of conventional capillary-based approaches. To improve this process, we developed Seq2Sat (sequence to microsatellite), an all-in-one software tool written in C++ that supports automated microsatellite genotyping. It takes raw reads of microsatellite amplicons directly as input and performs read quality control before inferring genotypes based on read depth, read ratio, sequence composition and length. We also developed a module for sex identification based on amplicons of sex chromosome-specific loci. To broaden user access and complement autoscoring, we developed SatAnalyzer (microsatellite analyzer), a user-friendly web-based platform that conducts reads-to-report analyses by calling Seq2Sat for genotype autoscoring and produces interactive genotype graphs for manual editing. SatAnalyzer also lets users troubleshoot multiplex optimization by analysing read quality and read distribution across loci and samples, in support of high-quality library preparation. To evaluate performance, we benchmarked the Seq2Sat/SatAnalyzer toolkit against a conventional capillary gel method and an existing microsatellite genotyping program, MEGASAT, using two datasets. Results showed that SatAnalyzer can achieve >99.70% genotyping accuracy and that Seq2Sat is ~5 times faster than MEGASAT while generating many more informative tables and figures. Seq2Sat and SatAnalyzer are freely available on GitHub (https://github.com/ecogenomicscanada/Seq2Sat) and Docker Hub (https://hub.docker.com/r/rocpengliu/satanalyzer).
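To illustrate how read depth and read ratio can drive a genotype call, the following is a minimal sketch in Python. It is not Seq2Sat's actual algorithm: the function name `call_genotype`, the thresholds `min_depth` and `min_ratio`, and the input layout (allele label mapped to read count) are all assumptions made for illustration, and the sketch ignores the sequence composition and length criteria the abstract mentions.

```python
from collections import Counter


def call_genotype(allele_reads, min_depth=10, min_ratio=0.3):
    """Call a diploid genotype at one locus from per-allele read counts.

    allele_reads: mapping of allele label (e.g. repeat count) -> read count.
    min_depth and min_ratio are hypothetical thresholds, not Seq2Sat defaults.
    Returns a sorted 2-tuple of alleles, or None if the locus has too few reads.
    """
    counts = Counter(allele_reads)
    total = sum(counts.values())
    if total == 0 or total < min_depth:
        return None  # depth filter: not enough reads to call this locus

    ranked = counts.most_common(2)  # the two best-supported alleles
    top_allele, top_count = ranked[0]

    if len(ranked) == 2:
        second_allele, second_count = ranked[1]
        # Read-ratio filter: accept the second allele only if it is
        # well supported relative to the most abundant allele.
        if second_count / top_count >= min_ratio:
            return tuple(sorted((top_allele, second_allele)))

    # Otherwise treat minor alleles as noise (e.g. stutter) and call homozygous.
    return (top_allele, top_allele)
```

For example, under these assumed thresholds `call_genotype({12: 90, 14: 80, 13: 5})` yields a heterozygous call of alleles 12 and 14, while `call_genotype({12: 100, 11: 8})` yields a homozygous call for allele 12 because the minor allele falls below the ratio threshold.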