Ge Jianye, Budowle Bruce, Cariaso Michael, Mittelman Kristen, Mittelman David
Othram Inc., The Woodlands, TX, United States.
Department of Forensic Medicine, University of Helsinki, Helsinki, Finland.
Front Genet. 2025 Jul 23;16:1635734. doi: 10.3389/fgene.2025.1635734. eCollection 2025.
Forensic genetic genealogy (FGG) is a force multiplier for human identification, leveraging dense single nucleotide polymorphism (SNP) data to infer relationships through identity by descent (IBD) segment analysis. Although FGG is powerful for generating investigative leads, broad adoption of SNP-based identification methods by the forensic community, especially medical examiners and crime laboratories, requires likelihood ratio (LR)-based relationship testing that aligns with traditional kinship testing standards. To address this gap, a novel method was developed that incorporates LR calculations into FGG and SNP testing workflows. The approach is unique in that it dynamically selects unlinked, highly informative SNPs based on configurable thresholds for minor allele frequency (MAF) and minimum genetic distance, supporting a robust and reliable analysis. Using a curated panel of 222,366 SNPs from gnomAD v4 and data from the 1000 Genomes Project, the method achieves high accuracy in resolving relationships up to second-degree relatives. For example, a subset of 126 SNPs (MAF > 0.4, minimum genetic distance of 30 cM) yielded 96.8% accuracy and a weighted F1 score of 0.975 across 2,244 tested pairs. This LR-based methodology enables forensic laboratories to select informative SNPs and integrate modern genomic data with existing accredited relationship testing frameworks, providing critical statistical support for close-relationship comparisons and enhancing the rigor of FGG- and SNP-based human identification applications.
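The two steps described in the abstract, SNP selection by MAF and genetic distance followed by an LR calculation, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a greedy left-to-right sweep for spacing SNPs, biallelic SNPs with genotypes coded as allele counts (0, 1, 2), Hardy-Weinberg proportions, and no mutation, genotyping error, or linkage disequilibrium. All function names (`select_snps`, `snp_lr`, `combined_lr`) and the input tuple layouts are hypothetical.

```python
from itertools import product
from math import prod

# Standard IBD coefficients (k0, k1, k2): probability that two people
# share 0, 1, or 2 alleles identical by descent at a locus.
IBD = {
    "parent_child": (0.0, 1.0, 0.0),
    "full_siblings": (0.25, 0.5, 0.25),
    "second_degree": (0.5, 0.5, 0.0),
    "unrelated": (1.0, 0.0, 0.0),
}

def select_snps(snps, min_maf=0.4, min_cm=30.0):
    """Greedy sweep (an assumption, not the paper's algorithm).

    snps: iterable of (snp_id, chrom, cm_position, maf).
    Keeps SNPs with maf > min_maf that lie at least min_cm from the
    previously kept SNP on the same chromosome.
    """
    kept, last_cm = [], {}
    for snp_id, chrom, cm, maf in sorted(snps, key=lambda s: (s[1], s[2])):
        if maf <= min_maf:
            continue
        if chrom in last_cm and cm - last_cm[chrom] < min_cm:
            continue
        kept.append(snp_id)
        last_cm[chrom] = cm
    return kept

def genotype_prob(g, p):
    """Hardy-Weinberg probability of carrying g copies of an allele with frequency p."""
    q = 1.0 - p
    return (q * q, 2 * p * q, p * p)[g]

def joint_prob_ibd1(g1, g2, p):
    """P(g1, g2 | exactly one allele shared IBD).

    Enumerates the shared allele s and the two non-shared alleles a, b
    (1 = counted allele, 0 = other allele).
    """
    freq = {1: p, 0: 1.0 - p}
    total = 0.0
    for s, a, b in product((0, 1), repeat=3):
        if s + a == g1 and s + b == g2:
            total += freq[s] * freq[a] * freq[b]
    return total

def snp_lr(g1, g2, p, relationship):
    """Single-SNP LR: P(genotypes | relationship) / P(genotypes | unrelated)."""
    k0, k1, k2 = IBD[relationship]
    unrelated = genotype_prob(g1, p) * genotype_prob(g2, p)
    related = (
        k0 * unrelated
        + k1 * joint_prob_ibd1(g1, g2, p)
        + k2 * genotype_prob(g1, p) * (g1 == g2)
    )
    return related / unrelated

def combined_lr(pairs, relationship):
    """Product of per-SNP LRs; valid only if the SNPs are independent (unlinked).

    pairs: iterable of (g1, g2, p).
    """
    return prod(snp_lr(g1, g2, p, relationship) for g1, g2, p in pairs)
```

Under these assumptions, `snp_lr(2, 2, 0.5, "parent_child")` evaluates to 2.0, matching the familiar 1/p result for a shared homozygous genotype. The product in `combined_lr` is only justified because the selected SNPs are spaced far enough apart to be treated as unlinked, which is why the spacing threshold matters. A production workflow would likely work in log space and model genotyping error so that a single inconsistent SNP does not force the LR to zero.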