Institute of Zoology and Biomedical Research, Faculty of Biology, Jagiellonian University, Gronostajowa st. 9, 30-387, Kraków, Poland; Malopolska Centre of Biotechnology, Jagiellonian University, Gronostajowa st. 7A, 30-387, Kraków, Poland.
Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beichen West Road 1-104, Chaoyang, Beijing, 100101, PR China; University of Chinese Academy of Sciences, 19 Yuquan Road, Shijingshan, Beijing, 100049, PR China.
Forensic Sci Int Genet. 2018 Nov;37:241-251. doi: 10.1016/j.fsigen.2018.08.017. Epub 2018 Aug 29.
Human head hair shape, commonly classified as straight, wavy, curly or frizzy, is an attractive target for Forensic DNA Phenotyping and for other applications of human appearance prediction from DNA, such as in paleogenetics. Genetic knowledge of head hair shape variation was recently improved by a series of genome-wide association and replication studies in a total of 26,964 subjects, which highlighted 12 loci, 8 of them novel, and introduced a prediction model for Europeans based on 14 SNPs. In the present study, we evaluated the capacity of DNA-based head hair shape prediction by investigating an extended set of candidate SNP predictors and by using an independent set of samples for model validation. Prediction models were built in 9674 previously studied subjects (6068 from Europe, 2899 from Asia and 707 of admixed European and Asian ancestries) by considering a novel list of 90 candidate SNPs. For model validation, genotype and phenotype data were newly collected in 2415 independent subjects (2138 Europeans and 277 non-Europeans) by applying two targeted massively parallel sequencing platforms, Ion Torrent PGM and MiSeq, or the MassARRAY platform. A binomial model was developed to predict straight vs. non-straight hair based on 32 SNPs from 26 genetic loci that we identified as contributing significantly to the model. This model achieved prediction accuracies, expressed as AUC, of 0.664 in Europeans and 0.789 in non-Europeans; the statistically significant difference was explained mostly by the effect of one EDAR SNP in non-Europeans. Considering sex and age in addition to the SNPs increased the prediction accuracies slightly but not significantly (AUC of 0.680 and 0.800, respectively). Based on the sample size and candidate DNA markers investigated, this study provides the most robust, validated, and accurate statistical prediction models and SNP predictor marker sets currently available for predicting head hair shape from DNA, providing the next step towards broadening Forensic DNA Phenotyping beyond pigmentation traits.
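To make the two quantitative ideas in the abstract concrete, the sketch below shows how a binomial (logistic) model can turn SNP genotypes into a probability of non-straight hair, and how an AUC can be computed from such probabilities. It is a minimal illustration only, not the study's model: the three weights, the intercept, the genotypes and the labels are all invented toy values, and the function names `predict_non_straight` and `auc` are hypothetical.

```python
import math


def predict_non_straight(allele_counts, weights, intercept):
    """Logistic model: probability of non-straight hair for one subject.

    allele_counts holds 0, 1 or 2 copies of the effect allele per SNP.
    weights and intercept are hypothetical, not the published estimates.
    """
    linear = intercept + sum(w * g for w, g in zip(weights, allele_counts))
    return 1.0 / (1.0 + math.exp(-linear))


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one subject of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (assumed): 6 subjects genotyped at 3 SNPs.
weights = [0.8, -0.3, 0.5]   # hypothetical per-allele log-odds
intercept = -1.0             # hypothetical baseline log-odds
genotypes = [(0, 2, 0), (2, 0, 1), (1, 1, 0), (0, 0, 0), (2, 1, 2), (1, 2, 1)]
labels = [0, 1, 1, 0, 1, 0]  # 1 = non-straight, 0 = straight

scores = [predict_non_straight(g, weights, intercept) for g in genotypes]
print(f"AUC on toy data: {auc(labels, scores):.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of straight from non-straight hair, which is the scale on which the reported values of 0.664 and 0.789 should be read.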