Pan Liyan, Liu Guangjian, Mao Xiaojian, Li Huixian, Zhang Jiexin, Liang Huiying, Li Xiuzhen
Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China.
Department of Genetics and Endocrinology, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China.
JMIR Med Inform. 2019 Feb 12;7(1):e11728. doi: 10.2196/11728.
Central precocious puberty (CPP) in girls seriously affects their physical and mental development in childhood. The standard diagnostic method, the gonadotropin-releasing hormone (GnRH) stimulation test or the GnRH analogue (GnRHa) stimulation test, is expensive and uncomfortable for patients because it requires repeated blood sampling.
We aimed to combine multiple CPP-related features and construct machine learning models to predict response to the GnRHa-stimulation test.
In this retrospective study, we analyzed clinical and laboratory data of 1757 girls who underwent a GnRHa test in order to develop XGBoost and random forest classifiers for prediction of response to the GnRHa test. The local interpretable model-agnostic explanations (LIME) algorithm was applied to these black-box classifiers to increase their interpretability. We measured the sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) of the models.
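As a rough illustration of the three evaluation metrics named above, the following sketch computes sensitivity, specificity, and AUC for a binary classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made up for this example and are not taken from the study data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = positive response to the test, 0 = negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]  # predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three are usually reported together.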
Both the XGBoost and random forest models achieved good performance in distinguishing between positive and negative responses, with the AUC ranging from 0.88 to 0.90, sensitivity ranging from 77.91% to 77.94%, and specificity ranging from 84.32% to 87.66%. Basal serum luteinizing hormone, follicle-stimulating hormone, and insulin-like growth factor-I levels were found to be the three most important factors. In the LIME explanations, these same variables contributed strongly to the predicted probability.
The prediction models we developed can help diagnose CPP and may be used as a prescreening tool before the GnRHa-stimulation test.