Department of Forensic Genetics, West China School of Basic Medical Sciences and Forensic Medicine, Sichuan University, Chengdu, 610041, Sichuan, China.
Department of Immunology, West China School of Basic Medical Sciences and Forensic Medicine, Sichuan University, Chengdu, 610041, Sichuan, China.
Int J Legal Med. 2022 Sep;136(5):1211-1226. doi: 10.1007/s00414-022-02820-2. Epub 2022 Apr 9.
Microhaplotypes (MHs) are a promising new type of forensic marker defined by combinations of two or more single-nucleotide polymorphisms (SNPs) within 200 bp. Their advantages, such as low mutation rates, absence of stutter artifacts, and short amplicons, have improved human identification, kinship analysis, ancestry prediction, and mixture deconvolution. Information on published MHs, e.g., allele frequencies, is available in two widely used public databases, the ALlele FREquency Database and MicroHapDB. However, many unpublished MHs are spread across the whole genome, and those databases do not draw on other resources (e.g., the SNP Database, dbSNP) to give users more integrated information. It is therefore essential to establish a robust, responsive, and comprehensive MH database. In this study, we thoroughly screened for SNP-SNP MHs among 26 populations from the 1000 Genomes Project (Phase 3). All genotype data for the SNPs in each MH were converted to PHASE input files, and allele frequencies were estimated using PHASE. We compiled a detailed summary of SNP-SNP MHs at the global, continental, and population levels, focusing on haplotypes and the A value, and supplemented our database with dbSNP data (last updated in 2015). We established a dual-SNP MH database (D-SNPsDB) of MHs within 50 bp for 26 populations, integrating basic data such as physical positions in the human genome, mappings to variant identifiers (rsIDs), allele frequencies, and basic variant information. For public queries, the D-SNPsDB web app was developed with the R Shiny package to provide this integrated information.
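
The screening and summary steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the study estimated haplotype frequencies with PHASE, whereas this sketch simply counts haplotypes that are assumed to be already phased. The function names, the reading of "within 50 bp" as a coordinate difference of at most 50, and the interpretation of the summary statistic as the effective number of alleles (1 / Σ p²) are all assumptions made for illustration.

```python
from collections import Counter


def snp_pairs_within(positions, max_span=50):
    """Return index pairs (i, j) of SNPs on one chromosome whose
    coordinates differ by at most max_span bp (assumed definition)."""
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    pairs = []
    for rank, i in enumerate(order):
        for j in order[rank + 1:]:
            # Positions are sorted, so later SNPs are only farther away.
            if positions[j] - positions[i] > max_span:
                break
            pairs.append((i, j))
    return pairs


def haplotype_frequencies(haplotypes):
    """Relative frequency of each haplotype string by direct counting.
    Assumes phased haplotypes; PHASE would instead infer them statistically."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


def effective_number_of_alleles(freqs):
    """Effective number of alleles, 1 / sum(p^2) (assumed summary statistic)."""
    return 1.0 / sum(p * p for p in freqs.values())


if __name__ == "__main__":
    # Hypothetical SNP coordinates on one chromosome.
    print(snp_pairs_within([100, 130, 160, 400]))
    # Hypothetical phased two-SNP haplotypes from four chromosomes.
    freqs = haplotype_frequencies(["AC", "AC", "AT", "GT"])
    print(freqs, effective_number_of_alleles(freqs))
```

Sorting the coordinates first lets the inner loop stop at the first SNP beyond the window, so the pair search stays close to linear when SNP density is moderate.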