Tai Chi-San, Ko Sung-Chu, Lee Chien-Chang, Yang Hui-Ru, Lin Chia-Ray, Choe Byung-Ho, Treepongkaruna Suporn, Chongsrisawat Voranush, Wu Chau-Chung, Chen Huey-Ling
Department of Pediatrics, National Taiwan University Children's Hospital, Taipei, Taiwan.
Center of Intelligent Healthcare, National Taiwan University Hospital, Taipei, Taiwan.
J Pediatr Gastroenterol Nutr. 2025 Oct;81(4):933-942. doi: 10.1002/jpn3.70166. Epub 2025 Jul 30.
Cholestasis in infancy poses a complex clinical conundrum for pediatric hepatologists and warrants timely diagnosis, especially for genetic diseases. This study aims to create machine learning (ML)-based prediction models, referred to as Jaundice Diagnosis Easy for Baby (JADE-B), to identify infants likely to have a genetic cause of cholestasis.
We retrieved patient data from the Integrated Medical Database at a university-affiliated tertiary medical center from 2006 to 2018. Patients with cholestatic disease were identified using liver-disease-specific International Classification of Diseases codes. A total of 47 clinical and laboratory parameters were used as ML inputs to predict genetic disease, defined as a disease-specific genetic diagnosis matched with phenotype. Four classifiers were used to build the models: logistic regression, XGBoost (XGB), LightGBM (LGBM), and random forest.
From a patient pool of 1845, 1008 infants below 1 year of age diagnosed with cholestatic liver disease were included in the analysis. A set of 47 pertinent clinical and laboratory features was used to train the ML models. We built five sets of models (Models 1-5), with areas under the receiver operating characteristic curve of 0.869, 0.884, 0.855, 0.852, and 0.836, respectively. A JADE-B model was built using 20 simple and widely accessible clinical parameters at disease onset, up to 1 month, to predict patients with genetic disorders.
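For readers unfamiliar with the metric reported above, the area under the receiver operating characteristic curve can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition in plain Python. The function name `auroc` and the toy labels and scores are illustrative assumptions; they are not the study's code or data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy example (NOT study data): 1 = genetic disease, 0 = no genetic disease.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
toy_auc = auroc(labels, scores)
print(round(toy_auc, 3))
```

In this toy example, 8 of the 9 positive-negative pairs are ranked correctly. The pairwise loop is quadratic in the number of cases, which is acceptable for illustration; library implementations typically use a sort-based method instead.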
The machine learning model prioritizes cholestatic infants for genetic testing and referral, and optimizes the use of genetic diagnostic resources.