Hu Jianzhong, Lu Chen, Rogers Bob, Chandler Martin, Santos Jarren
Statistics and Data Science, American Thrombosis and Hemostasis Network, Rochester, USA.
Center for Digital Health Innovation, University of California at San Francisco, San Francisco, USA.
Cureus. 2024 Aug 13;16(8):e66810. doi: 10.7759/cureus.66810. eCollection 2024 Aug.
Background Artificial intelligence (AI) and machine learning (ML) are currently used in clinical practice to improve outcome prediction in disease diagnosis and prognosis. However, to date, few AI/ML applications have been reported in rare diseases such as hemophilia. In this study, taking advantage of the ATHNdataset, an extensive repository of hemostasis and thrombosis data, we aimed to demonstrate how AI/ML approaches can be used to build predictive models that identify persons with hemophilia (PwH) who are at risk of poor outcomes and that inform providers' clinical decision-making, helping patients prevent long-term complications.
Materials and methods This project was carried out in two steps. First, data from ATHN 7, a subset study of the ATHNdataset, were mined to determine markers that defined "poor outcome." Second, we applied multiple AI/ML approaches to the ATHNdataset to validate our findings and to develop predictive models to identify PwH at risk of poor outcomes. A classical regression-based predictive model was used as a reference to evaluate the performance of the various AI/ML models.
Results Our models included features that were distributed similarly to the response variables of interest, which limited their ability to distinguish poor outcomes. Recall was low (<53%), so no single model reliably identified poor outcomes among all actual positive cases. Our results suggest that building a more useful AI/ML model may require a larger dataset along with additional features. Furthermore, most of the AI/ML models outperformed the classical logistic regression model in both accuracy and precision.
Conclusions Our AI and ML models showed limited ability to predict poor outcomes in people with hemophilia.
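The Results contrast low recall with comparatively better accuracy and precision. As a minimal sketch of how these three metrics relate, the snippet below computes them from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the study or from the ATHNdataset.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of the cases flagged as poor outcome, how many truly were.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all true poor-outcome cases, how many the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts (NOT study data): poor outcomes are the rare positive class.
metrics = classification_metrics(tp=50, fp=20, fn=50, tn=880)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
# accuracy: 0.93
# precision: 0.71
# recall: 0.50
```

With an imbalanced outcome like the one assumed here, a model can score well on accuracy and precision while still missing half of the true positive cases, which is why recall is the more informative metric when the goal is to find every at-risk patient.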