Department of Forensic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430030, PR China; Hubei Key Laboratory of the Forensic Science, Hubei University of Police, Wuhan, Hubei 430035, PR China.
Department of Forensic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430030, PR China.
Forensic Sci Int Genet. 2023 Nov;67:102941. doi: 10.1016/j.fsigen.2023.102941. Epub 2023 Oct 5.
Accurate age estimation from semen has the potential to greatly narrow the pool of unidentified suspects in sexual assault investigations. However, previous efforts utilizing semen age-related CpG (AR-CpG) markers have shown lower accuracy than blood AR-CpG-based methods. This discrepancy may be attributed to DNA methylation (DNAm) interference from "round cells" such as leukocytes and immature sperm cells in semen. This study aimed to develop age calculators based on sperm-specific AR-CpG markers and to achieve more accurate age estimates from sperm DNA. Through an analysis of publicly available MethylationEPIC microarray data from 90 sperm samples of healthy males aged 22-51 years, we identified 31 sperm-specific AR-CpG markers with absolute Pearson's R values > 0.5 and Benjamini-Hochberg adjusted p values < 0.013. The top 19 AR-CpG markers with the largest absolute R values and beta ranges > 0.10, along with 3 reported semen AR-CpG markers (cg06304190, cg06979108, and cg12837463), were integrated into two methylation SNaPshot panels (Ⅰ and Ⅱ), each containing 11 markers. The 21 qualified AR-CpG markers showed absolute R values ≥ 0.427 in an independent validation cohort of 253 sperm DNA samples (22-67 years), with cg21843517 exhibiting the strongest age correlation (R = 0.853). The optimal models, constructed using sperm DNAm data of the training set (n = 214, 22-67 years) and markers from panel Ⅰ (n = 11), panel Ⅱ (n = 10), or both panels, achieved mean absolute errors (MAEs) of 2.526-4.746, 3.890-5.715, and > 9.800 years on the test sets of sperm (n = 39, 23-64 years), semen (same donors as the sperm test set), and whole blood (n = 40, 22-65 years), respectively. The simplified models incorporating 3, 5, 9, or 14 AR-CpG markers (MAE = 2.918-4.139 years for sperm) still outperformed the Lee et al. original model (MAE = 6.444 years for semen) and the reconstructed panel Lee model (MAE = 6.011 years for sperm).
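The marker-screening criteria described above (absolute Pearson's R, beta range, Benjamini-Hochberg adjustment) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy beta values, and the `cg_demo_*` identifiers are hypothetical, and the R and beta-range filters are applied in a single pass for brevity. The p values passed to `bh_adjust` are assumed to come from a separate correlation test, which is not implemented here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def screen_markers(betas, ages, r_min=0.5, range_min=0.10):
    """Keep CpGs with |R| > r_min and beta range > range_min, sorted by |R|."""
    selected = []
    for cpg, values in betas.items():
        r = pearson_r(values, ages)
        beta_range = max(values) - min(values)
        if abs(r) > r_min and beta_range > range_min:
            selected.append((cpg, r, beta_range))
    selected.sort(key=lambda item: abs(item[1]), reverse=True)
    return selected

# Hypothetical toy data: five donors, three made-up CpG sites.
demo_ages = [22, 30, 38, 46, 51]
demo_betas = {
    "cg_demo_up": [0.20, 0.26, 0.33, 0.41, 0.45],      # age-correlated, wide range
    "cg_demo_flat": [0.50, 0.52, 0.49, 0.51, 0.50],    # no clear age trend
    "cg_demo_narrow": [0.30, 0.31, 0.32, 0.33, 0.34],  # correlated, range too small
}

for cpg, r, beta_range in screen_markers(demo_betas, demo_ages):
    print(cpg, round(r, 3), round(beta_range, 2))
```

In this toy input only the first site passes both filters; the third is strongly correlated with age but its beta range falls below 0.10, which is the situation the beta-range criterion is meant to exclude.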
The final models, utilizing all sperm DNAm data (n = 253) and markers from panel Ⅰ, panel Ⅱ, or both panels, yielded mean MAEs of 2.587, 2.766, and 2.200 years, respectively, on the 50 test sets generated by 5 repeats of 10-fold cross-validation. Additionally, multiple markers in both panels distinguished sperm or semen from blood with 100% accuracy. In summary, our study substantiates the potential of sperm-specific AR-CpG markers for precise age estimation from sperm DNA, providing an improved toolset for forensic investigations.
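The evaluation scheme mentioned above, 5 repeats of 10-fold cross-validation giving 50 test sets, each scored by MAE, can be sketched as follows. The abstract does not state which regression method underlies the models, so a single-marker least-squares line is used here purely as a stand-in; the function names and the synthetic data are assumptions for illustration.

```python
import random

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def repeated_kfold_mae(x, y, k=10, repeats=5, seed=0):
    """Return one MAE per held-out fold (k * repeats values in total)."""
    rng = random.Random(seed)
    n = len(x)
    fold_maes = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)                       # new random partition per repeat
        folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            slope, intercept = fit_line(
                [x[i] for i in train_idx], [y[i] for i in train_idx]
            )
            preds = [slope * x[i] + intercept for i in test_idx]
            fold_maes.append(mae([y[i] for i in test_idx], preds))
    return fold_maes

# Synthetic stand-in data: one hypothetical marker whose beta value rises with age.
noise = random.Random(1)
ages = list(range(22, 68))
betas = [0.10 + 0.005 * a + noise.gauss(0, 0.01) for a in ages]

scores = repeated_kfold_mae(betas, ages)
print(len(scores), "folds; mean MAE =", round(sum(scores) / len(scores), 3))
```

Averaging the per-fold MAEs gives a single summary figure comparable in form to the mean MAEs reported for the final models, though the numbers produced by this synthetic example carry no meaning for the study itself.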