Department of Otolaryngology-Head and Neck Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, 600 Yishan Road, Shanghai 200233, China; Otolaryngology Institute of Shanghai Jiao Tong University, Shanghai 200233, China; Shanghai Key Laboratory of Sleep Disordered Breathing, Shanghai 200233, China.
Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China.
Hear Res. 2021 Aug;407:108281. doi: 10.1016/j.heares.2021.108281. Epub 2021 Jun 6.
The overall genetic profile of noise-induced hearing loss (NIHL) remains elusive. Herein, we proposed a novel machine learning (ML)-based strategy to evaluate individual susceptibility to NIHL and to identify the underlying genetic risk variants in a subsample of participants with extreme phenotypes.
Five features (age, sex, cumulative noise exposure [CNE], smoking status, and alcohol drinking status) of 5,539 shipbuilding workers from large cross-sectional surveys were entered into four ML classification models to predict their hearing levels. The area under the curve (AUC) and prediction accuracy were used to evaluate model performance. Based on the prediction error of the ML models, an NIHL-susceptible group (n=150) and an NIHL-resistant group (n=150), whose observed hearing levels contradicted what their features predicted, were screened separately, and whole-exome sequencing (WES) was used to identify the underlying variants associated with NIHL risk. Candidate risk variants were subsequently validated in an additional replication cohort (n=2,108), followed by a meta-analysis.
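The extreme-phenotype screening step can be illustrated with a minimal sketch. It assumes each worker has an observed binary hearing-loss label and a model-predicted probability of hearing loss, and ranks workers by the signed prediction error. The function name, the example IDs, and the ranking rule are hypothetical; the abstract does not state the exact selection criterion the authors used.

```python
def select_extreme_phenotypes(ids, observed, predicted, n):
    """Pick the n most 'paradoxical' workers in each direction.

    observed:  1 = hearing loss present, 0 = normal hearing (assumed coding)
    predicted: model-estimated probability of hearing loss, in [0, 1]
    """
    # Signed prediction error: positive when hearing is worse than predicted.
    residuals = [(obs - pred, pid) for pid, obs, pred in zip(ids, observed, predicted)]

    # Susceptible: hearing loss observed although the model predicted low risk.
    susceptible = sorted((r for r in residuals if r[0] > 0), reverse=True)[:n]
    # Resistant: normal hearing observed although the model predicted high risk.
    resistant = sorted(r for r in residuals if r[0] < 0)[:n]

    return [pid for _, pid in susceptible], [pid for _, pid in resistant]


# Made-up toy data, for illustration only.
ids = ["w1", "w2", "w3", "w4", "w5", "w6"]
observed = [1, 1, 0, 0, 1, 0]
predicted = [0.15, 0.80, 0.90, 0.10, 0.40, 0.65]

susceptible_ids, resistant_ids = select_extreme_phenotypes(ids, observed, predicted, n=2)
```

In the study itself, the two groups of 150 were drawn from 5,539 workers; using out-of-fold predictions from cross-validation for `predicted` would be one way to avoid selecting on overfitted errors, though that is an assumption here rather than a documented detail.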
With 10-fold cross-validation, the performances of the four ML models were robust and similar, with average AUCs and accuracies ranging from 0.783 to 0.798 and 73.7% to 73.8%, respectively. The phenotypes of the NIHL-susceptible and NIHL-resistant groups were significantly different (all p<0.001). After WES analysis and filtering, 12 risk variants contributing to NIHL susceptibility were identified and replicated. The meta-analyses showed that the A allele of CDH23 rs41281334 (odds ratio [OR]=1.506, 95% confidence interval [CI]=1.106-2.051) and the C allele of WHRN rs12339210 (OR=3.06, 95% CI=1.398-6.700) were significantly associated with increased risk of NIHL after adjustment for confounding factors.
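As a rough consistency check on the reported odds ratios, the following sketch recovers the standard error of the log odds ratio from each 95% CI and derives an approximate two-sided p-value. It assumes the intervals are Wald-type, that is, symmetric on the log scale; the abstract does not state how they were computed, and the function name is hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_ci(odds_ratio, ci_low, ci_high):
    """Return (SE of log OR, z statistic, two-sided p), assuming a Wald-type CI."""
    log_or = math.log(odds_ratio)
    # A Wald interval spans log(OR) +/- Z_95 * SE, so its log-width is 2 * Z_95 * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = log_or / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values reported in the abstract.
cdh23 = wald_from_ci(1.506, 1.106, 2.051)  # CDH23 rs41281334, A allele
whrn = wald_from_ci(3.06, 1.398, 6.700)    # WHRN rs12339210, C allele
```

Under this assumption both variants give p-values below 0.05, in line with the reported significance, and the geometric mean of each interval's bounds falls close to the corresponding point estimate.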
This study revealed two genetic variants, CDH23 rs41281334 and WHRN rs12339210, that are associated with NIHL risk, using a promising ML-based approach for evaluating individual susceptibility.