van Linschoten Reinier C A, van Leeuwen Nikki, van Klaveren David, Pierik Marieke J, Creemers Rob, Hendrix Evelien M B, Hazelzet Jan A, van der Woude C Janneke, West Rachel L, van Noord Desirée
Department of Gastroenterology & Hepatology, Franciscus Gasthuis & Vlietland, P.O. Box 10900, 3004 BA, Rotterdam, The Netherlands.
Department of Gastroenterology & Hepatology, Erasmus MC, P.O. Box 2040, 3000 CA, Rotterdam, The Netherlands.
J Crohns Colitis. 2025 Feb 4;19(2). doi: 10.1093/ecco-jcc/jjaf017.
Large registries are promising tools for studying the epidemiology of inflammatory bowel disease (IBD). We aimed to develop and validate machine learning models that identify IBD cases in administrative data, and to use them to determine the prevalence, incidence, and mortality of IBD in the Netherlands.
We developed machine learning models for administrative data to identify IBD cases and classify them by subtype and incidence year. Models were developed in a population-based cohort and externally validated in a hospital cohort. Models were evaluated on Brier score, area under the receiver operating characteristic curve (AUC), calibration, and accuracy. The best models were used to determine the epidemiology of IBD in the Netherlands between 2013 and 2020.
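As an illustration of the evaluation metrics named above, the following is a minimal sketch of how the Brier score, AUC, and accuracy can be computed from predicted probabilities. It is not the authors' code: the function names, the 0.5 classification threshold, and the toy data are assumptions made for illustration only.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """AUC as the share of (case, non-case) pairs in which the case
    receives the higher predicted probability; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(y_true, y_prob, threshold=0.5):
    """Share of records classified correctly at an assumed probability threshold."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


# Hypothetical toy data (not from the study): 1 = IBD case, 0 = non-case.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.7]

print(brier_score(y_true, y_prob))  # lower is better
print(auc(y_true, y_prob))          # 0.5 = chance, 1.0 = perfect ranking
print(accuracy(y_true, y_prob))
```

The pairwise AUC shown here is quadratic in the number of records, which is fine for a toy example; a rank-based computation would be preferable for registry-scale data.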
For identifying IBD cases, the random forest model performed best (AUC: 0.97, 95% CI [0.96; 0.97]). For subtype, the gradient-boosted trees model performed best (accuracy: 0.95, 95% CI [0.94; 0.95]); for incidence year, the random forest model performed best (accuracy: 0.88, 95% CI [0.86; 0.89]). The prevalence of IBD in the Netherlands was 577.6 (95% CI [566.7; 586.2]) per 100 000 on December 31, 2020, and varied across regions of the country. The incidence of IBD was 20.1 (95% CI [18.0; 22.3]) per 100 000 in 2020 and was stable over time. Mortality rates among IBD patients rose over time and were 11.6 (95% CI [10.5; 11.8]) per 1000 in 2020, compared with 9.5 per 1000 in the general population.
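To make the units in these results concrete, the sketch below shows how a crude rate per 100 000 (or per 1000) is derived from a count and a population, and how the two reported 2020 mortality rates compare as a simple ratio. The helper name and the example counts are hypothetical; the ratio is a crude one, not adjusted for age or sex, and is not a figure reported in the abstract.

```python
def rate_per(events, population, per=100_000):
    """Crude rate: events divided by population, scaled to `per` persons."""
    return events / population * per


# Hypothetical counts, for illustration only (not study data).
print(rate_per(500, 100_000))        # prevalence per 100 000
print(rate_per(12, 1_000, per=1_000))  # mortality per 1000

# Crude ratio of the two mortality rates reported for 2020 (per 1000).
ibd_mortality = 11.6
general_mortality = 9.5
crude_ratio = ibd_mortality / general_mortality
print(round(crude_ratio, 2))
```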
Inflammatory bowel disease cases can be accurately identified using administrative data. The prevalence of IBD in the Netherlands is increasing more slowly than expected, suggesting a trend towards the epidemiological stage of Prevalence Equilibrium.