Ibrahim Omar, Sutherland Heidi G, Lea Rodney A, Nasrallah Fatima, Maksemous Neven, Smith Robert A, Haupt Larisa M, Griffiths Lyn R
Genomics Research Centre, Centre for Genomics and Personalised Health, School of Biomedical Sciences, Queensland University of Technology, 60 Musk Ave, Kelvin Grove, QLD, 4059, Australia.
The Queensland Brain Institute, The University of Queensland, Brisbane, Australia.
J Mol Med (Berl). 2022 Feb;100(2):303-312. doi: 10.1007/s00109-021-02158-z. Epub 2021 Nov 19.
A percentage of the population suffers prolonged and persistent post-concussion symptoms (PCS) following average head injuries or develops severe neurological dysfunction following minor head trauma. Genetic variants that may contribute to individual response to head trauma have been investigated in some studies, but to date none have applied machine learning (ML) methods to genomic data to specifically examine outcomes of head trauma. Whole exome sequencing (WES) was completed for three groups of individuals (N = 60): (a) 16 individuals with severe neurological responses to minor head trauma, (b) 26 individuals with persistent PCS and (c) 18 individuals with normal recovery from concussion or mild traumatic brain injury (mTBI). Gradient boosted tree algorithms were applied to the data using XGBoost. By using variants with CADD scores above 15 in the training set (randomly sampled 70%), we identified signatures that accurately distinguish the test groups with an average area under the curve (AUC) of 0.8 (SE = 0.019). Metrics including positive and negative predictive values, as well as kappa, were all within an acceptable range to support the prediction accuracy. This study illustrates how ML methods in combination with WES data have the potential to distinguish severe or prolonged responses to head trauma from healthy recovery. KEY MESSAGES: Linear association analysis has been inconclusive in concussion genetics. Non-linear methods such as boosted trees can offer better insights in small samples. Strong discrimination trends can be achieved from exome data of cases and controls.
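The evaluation quantities named in the abstract (the CADD filter, the random 70% training split, AUC, positive and negative predictive values, kappa, and a mean with standard error) can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline: the function names, the variant record layout (`{"id": ..., "cadd": ...}`), the binary case/control labelling, and the idea of averaging AUC over repeated random splits are all assumptions made for illustration, and the gradient boosted tree model itself (XGBoost in the study) is deliberately left out.

```python
import math
import random
from statistics import mean, stdev

CADD_THRESHOLD = 15  # the abstract uses variants with CADD scores above 15


def filter_by_cadd(variants, threshold=CADD_THRESHOLD):
    """Keep variants whose CADD score is strictly above the threshold.

    `variants` is a hypothetical list of dicts with a numeric "cadd" key.
    """
    return [v for v in variants if v["cadd"] > threshold]


def split_70_30(sample_ids, seed):
    """Randomly assign 70% of samples to training and the rest to testing."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    cut = round(0.7 * len(ids))
    return ids[:cut], ids[cut:]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Assumes binary labels (1 = case, 0 = control)
    and at least one sample of each class.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels, predictions):
    """Return PPV, NPV and Cohen's kappa for binary labels and predictions.

    Assumes both classes are predicted at least once and that chance
    agreement is below 1; otherwise a division by zero occurs.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    total = tp + tn + fp + fn
    observed = (tp + tn) / total
    # Agreement expected by chance from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


def mean_and_se(values):
    """Mean and standard error (sample SD / sqrt(n)) of repeated estimates."""
    return mean(values), stdev(values) / math.sqrt(len(values))
```

In use, one would filter the variant set, draw a split with `split_70_30`, fit a classifier on the training samples, and pass its test-set scores to `auc` and its thresholded predictions to `confusion_metrics`. Collecting the AUC over several seeds and passing the list to `mean_and_se` gives a summary of the same form as the reported "AUC of 0.8 (SE = 0.019)", although the abstract does not state how that average was obtained.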