Ren Yingchun, Shan Xiaoying, Ding Gengchao, Ai Ling, Zhu Weiying, Ding Ying, Yu Fuzhou, Chen Yun, Wu Beijiao
College of Data Science, Jiaxing University, Jiaxing, Zhejiang, 314001, China.
College of Information Science and Engineering, Jiaxing University, Jiaxing, Zhejiang, 314001, China.
BMC Pregnancy Childbirth. 2025 Jan 30;25(1):89. doi: 10.1186/s12884-025-07180-4.
Intrahepatic cholestasis of pregnancy (ICP) is a liver disorder that occurs in the second and third trimesters of pregnancy and is associated with a significant risk of fetal complications, including premature birth and fetal death. In clinical practice, ICP is diagnosed mainly on the basis of pruritus and elevated serum total bile acid, an approach that may result in missed or delayed diagnoses. It is therefore essential to identify the risk factors associated with ICP and to detect affected individuals accurately so that prophylactic interventions can be given in time. Few published studies have used artificial intelligence to predict ICP, so developing machine learning-based diagnostic and severity classification models for ICP is of considerable value.
This study included ICP patients and healthy pregnant women seen at Jiaxing Maternity and Child Health Care Hospital in China between July 2020 and October 2023. We collected clinical data from their pregnancies and selected the 11 most important risk factors using univariable and lasso regression analysis. The dataset was randomly divided into training and testing cohorts. Thirteen machine learning techniques, including Random Forest, Support Vector Machine, and Artificial Neural Network, were applied. The five models with the best classification performance on the training set were selected for internal validation.
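The random division into training and testing cohorts can be illustrated with a minimal sketch. The abstract does not give the split ratio or say whether the split was stratified by class, so the 80/20 ratio, the stratification, the seed, and the function name `stratified_split` are all assumptions made for illustration. Only the class sizes (300 normal, 312 mild, 186 severe) come from the abstract.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train and test sets, class by class.

    Hypothetical helper: the 80/20 ratio and the stratification are
    assumptions, not details reported in the abstract.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        # Hold out the same fraction of every class.
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Placeholder labels that reproduce only the reported class sizes.
labels = ["normal"] * 300 + ["mild"] * 312 + ["severe"] * 186
train_idx, test_idx = stratified_split(labels)
```

Stratifying keeps the proportion of normal, mild, and severe cases roughly equal in both cohorts, which matters when the smallest class (severe) is well under a quarter of the data.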
The dataset included 798 participants (300 normal, 312 mild, and 186 severe cases). Univariable and lasso regression analysis identified total bile acid, gamma-glutamyl transferase, multiple pregnancy, lymphocyte percentage, hematocrit, neutrophil percentage, prothrombin time, aspartate aminotransferase, red blood cell count, lymphocyte count, and platelet count as risk factors for ICP. The AUCs of the five selected models ranged from 0.9509 to 0.9614. The CatBoost model performed best, with an AUC of 0.9614 (95% confidence interval, 0.9377-0.9813), an accuracy of 0.9085, a precision of 0.8930, a recall of 0.9059, and an F1-score of 0.8981.
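For a three-class problem (normal, mild, severe), precision, recall, and F1-score must be averaged across classes in some way. The abstract does not state which averaging scheme was used. The reported F1-score (0.8981) is slightly lower than the harmonic mean of the reported precision and recall (about 0.8994), which would be consistent with averaging per-class F1 values, but that is an inference, not something the abstract confirms. The sketch below shows macro-averaging from a confusion matrix; the function name `macro_metrics` and the example matrix are hypothetical and are not the study's data.

```python
def macro_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": sum(confusion[k][k] for k in range(n)) / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,  # mean of per-class F1 values
    }


# Illustrative matrix only (rows: true normal, mild, severe).
example = [
    [8, 2, 0],
    [1, 8, 1],
    [0, 2, 8],
]
metrics = macro_metrics(example)

# Harmonic mean of the macro precision and recall, for comparison
# with the mean of per-class F1 values.
harmonic = (2 * metrics["precision"] * metrics["recall"]
            / (metrics["precision"] + metrics["recall"]))
```

With macro-averaging, the mean of per-class F1 values generally differs from the harmonic mean of the averaged precision and recall, so the two should not be expected to match exactly.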
We identified risk factors for ICP and developed machine learning models based on these factors. The models performed well and can help predict whether a pregnant woman has ICP and, if so, whether it is mild or severe.