College of Forensic Medicine, Hebei Key Laboratory of Forensic Medicine, Collaborative Innovation Center of Forensic Medical Molecular Identification, Research Unit of Digestive Tract Microecosystem Pharmacology and Toxicology, Hebei Medical University, Chinese Academy of Medical Sciences, Shijiazhuang, 050017, Hebei, China.
Physical Examination Center of Shijiazhuang People's Hospital, Shijiazhuang, 050011, Hebei, China.
Hum Genet. 2024 Mar;143(3):401-421. doi: 10.1007/s00439-024-02659-0. Epub 2024 Mar 20.
As a vital anthropometric characteristic, human height not only helps in understanding overall developmental status and genetic risk factors but is also important for forensic DNA phenotyping. We used linear regression analysis to test the association between each CpG probe and the height phenotype, identifying a total of 11,730 height-associated sites. Next, we designed a methylation sequencing panel targeting 959 CpGs and constructed height inference models for the Chinese population. By employing kernel principal component analysis (KPCA) and deep neural networks, we developed a prediction model with a cross-validation RMSE, MAE and R of 5.62 cm, 4.45 cm and 0.64, respectively. Genetic factors could explain 39.4% of the methylation level variance of the sites used in the height inference models. Collectively, we demonstrated an association between height and DNA methylation status through an epigenome-wide association study (EWAS). Targeted methylation sequencing of only 959 CpGs combined with deep learning techniques could provide a model that estimates human height with higher accuracy than SNP-based prediction models.
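To make the two quantitative steps in the abstract concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a single-predictor regression of a probe's methylation level on height with no covariates (the abstract does not state the model's direction or adjustment terms), and it assumes R denotes the Pearson correlation between observed and predicted height. The function names `simple_linreg`, `rmse`, `mae` and `pearson_r` are hypothetical, and the numbers in the usage lines are made up for illustration.

```python
import math


def simple_linreg(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, t_stat), where t_stat is the slope divided by
    its standard error. Assumes len(x) > 2 and that x is not constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares with n - 2 degrees of freedom.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:
        # Perfect fit: the t statistic is unbounded (or 0 for a flat line).
        t_stat = math.copysign(math.inf, slope) if slope != 0 else 0.0
    else:
        t_stat = slope / se
    return slope, intercept, t_stat


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


# Illustrative, made-up data: heights in cm and one probe's methylation levels.
heights = [160, 165, 170, 175, 180]
probe_beta = [0.50, 0.52, 0.55, 0.56, 0.60]
slope, intercept, t_stat = simple_linreg(heights, probe_beta)

# Illustrative, made-up observed and predicted heights for the error metrics.
observed = [170, 160, 180]
predicted = [173, 156, 180]
print(slope, t_stat, rmse(observed, predicted), mae(observed, predicted))
```

In an epigenome-wide scan, the regression would be repeated for every probe and the resulting test statistics corrected for multiple testing; the error metrics would be computed on held-out folds rather than on the training data.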