Department of Nuclear Medicine and Molecular Imaging, University of Groningen, University Medical Center Groningen, Medical Microbiology & Infection Prevention, Hanzeplein 1, Groningen, 9713 GZ, The Netherlands.
Department of Medical Microbiology and Infection Prevention, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands.
Eur J Nucl Med Mol Imaging. 2024 Nov;51(13):3924-3933. doi: 10.1007/s00259-024-06774-y. Epub 2024 Jun 21.
Prosthetic valve endocarditis (PVE) is a serious complication of prosthetic valve implantation, with an estimated yearly incidence of at least 0.4-1.0%. The Duke criteria and their subsequent modifications were developed as a diagnostic framework for infective endocarditis (IE) in clinical studies. However, their sensitivity and specificity are limited, especially for PVE. Their most recent versions (ESC2015 and ESC2023) include advanced imaging modalities, e.g., cardiac CTA and [18F]FDG PET/CT, as major criteria. Despite these significant changes, the weighting system based on major and minor criteria has remained unchanged, which may have introduced bias into the diagnostic criteria set. Here, we aimed to evaluate and improve the predictive value of the modified Duke/ESC 2015 (MDE2015) criteria by using machine learning algorithms.
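For orientation, the major/minor counting rule referred to above can be written as a small function. This is a minimal sketch of the clinical counting rule of the modified Duke criteria only: the pathological criteria and the explicit rejection criteria are omitted, and the function name `duke_category` and its arguments are illustrative assumptions, not taken from the study.

```python
def duke_category(n_major: int, n_minor: int) -> str:
    """Classify a case from counts of fulfilled major and minor criteria.

    Simplified clinical rule (pathological criteria not modelled):
      definite: 2 major, or 1 major + 3 minor, or 5 minor
      possible: 1 major + 1 minor, or 3 minor
      rejected: anything else
    """
    if n_major < 0 or n_minor < 0:
        raise ValueError("criterion counts must be non-negative")
    if n_major >= 2 or (n_major == 1 and n_minor >= 3) or n_minor >= 5:
        return "definite"
    if (n_major == 1 and n_minor >= 1) or n_minor >= 3:
        return "possible"
    return "rejected"


# Every major criterion counts the same, whichever finding produced it.
print(duke_category(n_major=1, n_minor=1))  # possible
print(duke_category(n_major=2, n_minor=0))  # definite
```

Because the rule only counts criteria, a positive imaging finding and a positive blood culture contribute identically. A fixed weighting of this kind is what the learned models are meant to relax.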
In this proof-of-concept study, we used data from a well-defined retrospective multicentre cohort of 160 patients evaluated for suspected PVE. Four machine learning algorithms were compared with the diagnosis predicted by the MDE2015 criteria: Lasso logistic regression, decision tree with gradient boosting (XGBoost), decision tree without gradient boosting, and a model combining the predictions of these three (ensemble learning). All models used the same features that constitute the MDE2015 criteria. The final diagnosis of PVE, based on endocarditis team consensus using all available clinical information, including surgical findings whenever surgery was performed, and with at least 1 year of follow-up, was used as the composite gold standard.
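The abstract does not state how the ensemble combined the individual predictions. One common option is unweighted soft voting, which averages each patient's predicted probability across models. The sketch below assumes that approach; the function `soft_vote`, the model keys, and all probabilities are hypothetical values chosen for illustration.

```python
from statistics import mean


def soft_vote(probabilities_by_model: dict[str, list[float]]) -> list[float]:
    """Average per-patient probabilities across models (unweighted soft voting)."""
    columns = list(probabilities_by_model.values())
    if not columns:
        raise ValueError("need predictions from at least one model")
    n_patients = len(columns[0])
    if any(len(col) != n_patients for col in columns):
        raise ValueError("all models must score the same patients")
    # zip(*columns) yields one tuple of model outputs per patient.
    return [mean(patient_probs) for patient_probs in zip(*columns)]


# Hypothetical predicted probabilities of PVE for three patients.
predictions = {
    "lasso_logistic": [0.90, 0.20, 0.55],
    "xgboost": [0.80, 0.10, 0.65],
    "decision_tree": [1.00, 0.00, 0.45],
}

ensemble = soft_vote(predictions)
labels = [p >= 0.5 for p in ensemble]  # 0.5 cut-off is an assumption
print(labels)  # [True, False, True]
```

Other combination schemes, such as weighted averaging or stacking with a meta-model, would fit the same description.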
The diagnostic performance of the MDE2015 criteria varied depending on how the category of 'possible' PVE was handled. When these cases were considered positive for PVE, sensitivity and specificity were 0.96 and 0.60, respectively. When they were considered negative, sensitivity and specificity were 0.74 and 0.98, respectively. Combining both approaches (possible endocarditis as positive and as negative) in a single ROC analysis resulted in an AUC of 0.917. For the machine learning models, sensitivity and specificity were as follows: logistic regression, 0.92 and 0.85; XGBoost, 0.90 and 0.85; decision trees, 0.88 and 0.86; and ensemble learning, 0.91 and 0.85. The corresponding AUCs were 0.938, 0.937, 0.930, and 0.941.
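One way to read the combined ROC analysis is to treat the three MDE2015 categories as an ordinal score (rejected < possible < definite). The two ways of handling 'possible' cases then become two cut-offs on the same score. The sketch below illustrates this reading on a made-up toy cohort; the data, the helper names `sens_spec` and `auc`, and the interpretation itself are assumptions, and the numbers are unrelated to the study results.

```python
RANK = {"rejected": 0, "possible": 1, "definite": 2}


def sens_spec(categories, truth, positive_from):
    """Sensitivity and specificity when categories >= positive_from count as positive."""
    threshold = RANK[positive_from]
    tp = fn = tn = fp = 0
    for category, has_pve in zip(categories, truth):
        predicted_positive = RANK[category] >= threshold
        if has_pve:
            tp += predicted_positive
            fn += not predicted_positive
        else:
            fp += predicted_positive
            tn += not predicted_positive
    return tp / (tp + fn), tn / (tn + fp)


def auc(categories, truth):
    """AUC as the probability that a PVE case outranks a non-PVE case (ties count 0.5)."""
    pos = [RANK[c] for c, t in zip(categories, truth) if t]
    neg = [RANK[c] for c, t in zip(categories, truth) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy cohort: four patients with PVE, four without (hypothetical).
categories = ["definite", "definite", "definite", "possible",
              "possible", "possible", "rejected", "rejected"]
truth = [True, True, True, True, False, False, False, False]

print(sens_spec(categories, truth, "possible"))  # (1.0, 0.5)
print(sens_spec(categories, truth, "definite"))  # (0.75, 1.0)
print(auc(categories, truth))                    # 0.9375
```

With only three categories, the ROC curve has just two interior operating points, whereas a model that outputs continuous probabilities can be thresholded anywhere along the curve.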
In this proof-of-concept study, machine learning algorithms achieved improved diagnostic performance compared to the major/minor weighting system used in the MDE2015 criteria. Moreover, these models provide quantifiable certainty levels for the diagnosis, potentially enhancing interpretability for clinicians. They also allow for easy incorporation of new and/or refined criteria, for example by assigning individual weights to advanced imaging modalities such as CTA or [18F]FDG PET/CT. These promising preliminary findings warrant further validation studies, ideally in a prospective cohort encompassing the full spectrum of patients with suspected IE.