Revisiting the Risk Factors for Endometriosis: A Machine Learning Approach.

Authors

Blass Ido, Sahar Tali, Shraibman Adi, Ofer Dan, Rappoport Nadav, Linial Michal

Affiliations

The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 91904, Israel.

Alan Edwards Pain Management Unit, McGill University Health Centre, Montreal, QC H3G 1A4, Canada.

Publication information

J Pers Med. 2022 Jul 7;12(7):1114. doi: 10.3390/jpm12071114.

Abstract

Endometriosis is a condition characterized by the implantation of endometrial tissue at extrauterine sites, mostly within the pelvic peritoneum. Endometriosis is under-diagnosed, and its prevalence is estimated at 5-10% of all women of reproductive age. The goal of this study was to develop a model for endometriosis based on the UK Biobank (UKB) and to re-assess the contribution of known risk factors to endometriosis. We partitioned the data into women diagnosed with endometriosis (5924; ICD-10: N80) and a control group (142,723). We included over 1000 variables from the UKB covering personal information about female health, lifestyle, self-reported data, genetic variants, and medical history prior to endometriosis diagnosis. We applied machine learning algorithms to train an endometriosis prediction model. The best prediction was achieved with the CatBoost gradient boosting algorithm on the data-combined model, with an area under the ROC curve (ROC-AUC) of 0.81. The same results were obtained for women from a mixed-ethnicity population of the UKB (7112; ICD-10: N80). We discovered that, prior to being diagnosed with endometriosis, affected women had significantly more ICD-10 diagnoses than the average unaffected woman. We used SHAP, an explainable AI tool, to estimate the marginal impact of a feature, given all other features. The informative features ranked by SHAP values included irritable bowel syndrome (IBS) and the length of the menstrual cycle. We conclude that the rich population-based retrospective data from the UKB are valuable for developing unified machine learning endometriosis models despite the limitations of missing data, noisy medical input, and participant age. The informative features of the model may improve clinical utility for endometriosis diagnosis.
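For readers unfamiliar with the reported metric, the following is a minimal sketch of how a ROC-AUC value can be computed from binary labels and model scores. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are assumptions made up for illustration, and the rank-based (Mann-Whitney) formulation is one standard way to obtain the same quantity.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Tied scores contribute 0.5, via average ranks.
    """
    # Sort sample indices by score, then give each tie group its average rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs at least one positive and one negative")

    # Mann-Whitney U statistic of the positive class, normalised to [0, 1].
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_pos = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_pos / (n_pos * n_neg)


# Toy data invented for illustration only (1 = case, 0 = control).
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.50, 0.60]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy ROC-AUC = {toy_auc:.3f}")
```

Read this way, a ROC-AUC of 0.81 means that a randomly chosen case receives a higher score than a randomly chosen control about 81% of the time; 0.5 corresponds to chance and 1.0 to perfect ranking.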
