Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, Thailand.
Department of Forensic Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand.
PLoS One. 2023 Jul 17;18(7):e0282551. doi: 10.1371/journal.pone.0282551. eCollection 2023.
Short tandem repeats (STRs) are short repeated sequences that are common in the human genome and valuable in forensic science, where they serve as markers of human identity and relatedness. Next-generation sequencing (NGS) technologies, e.g., ForenSeq Signature Prep, can sequence STRs, yielding both length-based alleles and single nucleotide polymorphisms (SNPs) and providing insight into population and sub-population structures. Despite the potential benefits of NGS for STRs, no open-source software platform integrates the collection, management, and analysis of STR data from NGS in one place. Users must process their STR data with multiple programs and then collect the results in a separate database or a file system folder. Moreover, analyzing repeat structures (STR repeat motifs) may require learning several software tools, which makes the process inefficient and cumbersome. To address this gap, we introduce STRategy, a standalone web-based application that supports essential STR data management and analysis. STRategy lets users collect their data in its database, automatically calculates forensic parameters, and visualizes the analyzed data in various forms. Users can search the database by profile, locus, or genotype, with or without a specific test kit, and can also find the nucleotide variants of a locus among the samples. We designed STRategy for internal use in a laboratory or an organization. Hence, the system includes role-based access control, so that users can search for or access only the data relevant to their responsibilities. The administrator role can customize the system, for example by configuring maps according to the samples' geographic data and by managing reference STR repeat motifs. A laboratory or an organization can download and install a copy of STRategy on its local system using Docker, as described in https://github.com/cucpbioinfo/STRategy.
In summary, STRategy is an end-to-end system that provides a database for collecting analyzed STR data from NGS, dynamically recomputes forensic parameters and STR pattern variants as new samples are added, and makes the results explorable through various search options and visualizations. The system is helpful for both forensic investigations and forensic genetics.
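
The abstract does not list which forensic parameters STRategy calculates or how it computes them. As an illustration only, the sketch below shows how a few commonly reported per-locus parameters (allele frequencies, observed and expected heterozygosity, matching probability, and power of discrimination) can be derived from a list of genotypes. The function name `forensic_parameters`, the input layout, and the sample genotypes are hypothetical and are not taken from the STRategy code base.

```python
from collections import Counter


def forensic_parameters(genotypes):
    """Compute basic forensic parameters for one STR locus.

    `genotypes` is a list of (allele_a, allele_b) pairs, one per sample.
    Alleles are kept as strings so that microvariants such as "9.3"
    are handled the same way as whole-repeat alleles.
    This is a hypothetical helper, not STRategy's implementation.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")

    # Allele frequencies: each sample contributes two alleles.
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    allele_freqs = {a: c / (2 * n) for a, c in allele_counts.items()}

    # Genotype counts, treating (a, b) and (b, a) as the same genotype.
    genotype_counts = Counter(tuple(sorted(pair)) for pair in genotypes)

    # Observed heterozygosity: fraction of samples with two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n

    # Expected heterozygosity (gene diversity), without small-sample correction.
    h_exp = 1.0 - sum(p * p for p in allele_freqs.values())

    # Matching probability: chance that two random samples share a genotype.
    pm = sum((c / n) ** 2 for c in genotype_counts.values())

    return {
        "allele_frequencies": allele_freqs,
        "observed_heterozygosity": h_obs,
        "expected_heterozygosity": h_exp,
        "matching_probability": pm,
        "power_of_discrimination": 1.0 - pm,
    }


if __name__ == "__main__":
    # Made-up genotypes for a single locus, for demonstration only.
    demo = [("10", "11"), ("10", "10"), ("11", "12"), ("10", "11")]
    for name, value in forensic_parameters(demo).items():
        print(name, value)
```

Because every value is recomputed from the full genotype list, calling such a function again after new samples are added would give updated parameters, which is one simple way to obtain the dynamic behavior described above.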