Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan; Department of Medical Physiology, College of Medicine, University of Nahdlatul Ulama Surabaya, Surabaya 60237, Indonesia.
Department of Medical Physiology, College of Medicine, University of Nahdlatul Ulama Surabaya, Surabaya 60237, Indonesia; Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei 11031, Taiwan.
EBioMedicine. 2020 Apr;54:102710. doi: 10.1016/j.ebiom.2020.102710. Epub 2020 Apr 10.
We developed and validated an artificial intelligence (AI)-assisted prediction of preeclampsia applied to a nationwide health insurance dataset in Indonesia.
The BPJS Kesehatan dataset was preprocessed using a nested case-control design into women with preeclampsia/eclampsia (n = 3318) and normotensive pregnant women (n = 19,883), drawn from all women with one pregnancy. The dataset provided 95 features consisting of demographic variables and medical histories, covering a window that started 24 months before the event and ended at delivery, which was defined as the event. Six algorithms were compared by area under the receiver operating characteristics curve (AUROC), with a subgroup analysis by time to the event. We compared our model to similar prediction models from systematically reviewed studies. In addition, we conducted a text mining analysis based on natural language processing techniques to interpret our modeling results.
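As an illustration of how candidate algorithms can be ranked by AUROC, the following is a minimal sketch and not the study's code. It computes AUROC as the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as half. The function name `auroc`, the model names, and the toy labels and scores are all hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (case, control) pairs in which the case
    scores higher than the control; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]  # cases
    neg = [s for y, s in zip(labels, scores) if y == 0]  # controls
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = preeclampsia/eclampsia, 0 = normotensive.
labels = [1, 1, 0, 0, 0, 1]

# Hypothetical predicted risks from two candidate models (assumed names).
model_scores = {
    "model_a": [0.9, 0.6, 0.3, 0.7, 0.2, 0.8],
    "model_b": [0.6, 0.4, 0.5, 0.7, 0.3, 0.2],
}

# Rank the candidates from highest to lowest AUROC.
ranking = sorted(
    ((auroc(labels, s), name) for name, s in model_scores.items()),
    reverse=True,
)
best_auroc, best_name = ranking[0]
print(best_name, round(best_auroc, 3))
```

The pairwise loop is quadratic in the number of women, so at the scale of the dataset described above a rank-based or library implementation would be preferable. The sketch only shows what the metric measures.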
The best model consisted of 17 predictors extracted by a random forest algorithm. The period of 9 to 12 months before the event had the best AUROC in external validation by either geographical split (0.88, 95% confidence interval (CI) 0.88-0.89) or temporal split (0.86, 95% CI 0.85-0.86). We compared this model to prediction models in seven studies identified from 869 records in PubMed, Embase, and Scopus. This model outperformed the previous models in terms of precision, sensitivity, and specificity in all validation sets.
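For reference, precision, sensitivity, and specificity can be derived from the four cells of a confusion matrix. The sketch below assumes binary labels and binary predictions; the function name `classification_metrics` and the toy data are hypothetical and are not taken from the study.

```python
def classification_metrics(labels, predictions):
    """Return precision, sensitivity, and specificity for binary labels
    and binary predictions (1 = predicted preeclampsia, 0 = not)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # true positives
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # false positives
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # false negatives
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # true negatives
    nan = float("nan")  # returned when a denominator is zero
    return {
        "precision": tp / (tp + fp) if tp + fp else nan,
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
    }


# Hypothetical toy validation set: 3 cases and 5 controls.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_predictions = [1, 1, 0, 1, 0, 0, 0, 0]

metrics = classification_metrics(toy_labels, toy_predictions)
print(metrics)
```

In this toy set there are 2 true positives, 1 false negative, 1 false positive, and 4 true negatives, so precision and sensitivity are both 2/3 and specificity is 4/5.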
Our low-cost model improved preliminary prediction, helping to decide which pregnant women should go on to be assessed by models with high specificity and advanced predictors.
This work was supported by grant no. MOST108-2221-E-038-018 from the Ministry of Science and Technology of Taiwan.