School of Chemical Engineering, University of Campinas, Campinas, Brazil; Departamento de Engenharia Química, Universidade Federal de São Paulo, Instituto de Ciências Ambientais, Químicas e Farmacêuticas, Brazil.
Hematology and Hemotherapy Center - University of Campinas/Hemocentro-Unicamp, Instituto Nacional de Ciência e Tecnologia do Sangue, Campinas, São Paulo, Brazil.
Int J Med Inform. 2020 Sep;141:104221. doi: 10.1016/j.ijmedinf.2020.104221. Epub 2020 Jun 18.
Recurrent venous thromboembolism (RVTE) is a multifactorial disease with occurrence rates that vary from 13% to 25% within 5 years of the initial event. Once a patient has had a first thrombotic event, the probability of recurrence should be determined, along with the most adequate anticoagulant therapy. To our knowledge, three statistical models have been proposed in the published literature to calculate RVTE probability. However, these models have several limitations, such as limited input variables, lack of external validation, and applicability only to patients with a first unprovoked thrombosis. Additionally, some of the models are known to fail to determine RVTE in patients with a low risk of recurrence.
This work presents an alternative procedure in which three Artificial Neural Network (ANN) models were developed to classify which patients will present RVTE based solely on clinical data.
Data on 39 clinical factors from 235 patients were used to train several ANN structures. The three models differed in their inputs. In ANN 1, the inputs were all 39 factors. In ANN 2, the inputs were 18 factors previously identified as the main predictors of RVTE using Principal Component Analysis (PCA). Finally, in ANN 3, the inputs were 15 factors combining the PCA results with practical aspects. Different numbers of hidden layers and neurons, and three optimization algorithms, were considered. 5-fold cross-validation was also performed.
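As an illustration of the 5-fold cross-validation step, the following is a minimal sketch of how 235 patient records could be partitioned into five folds. The function name `k_fold_indices`, the fixed seed, and the shuffling strategy are assumptions made for this example; the abstract does not state how the folds were constructed or whether they were stratified.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train_indices, test_indices) pairs covering all samples.

    Hypothetical helper for illustration only; not the authors' procedure.
    """
    indices = list(range(n_samples))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)

    # Distribute samples as evenly as possible across the k folds.
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(indices[start:start + size])
        start += size

    # Each fold serves once as the test set; the rest form the training set.
    splits = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits


# 235 patients, as reported in the abstract.
splits = k_fold_indices(235, k=5)
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_number}: train={len(train_idx)} test={len(test_idx)}")
```

With 235 patients and five folds, each fold holds 47 records for testing and leaves 188 for training, so every patient is used for testing exactly once.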
The results showed that all models were capable of performing this task. Different optimization algorithms led to different results. The best models presented high accuracy. The best structures were 39-10-10-1, 18-10-5-1, and 15-15-10-1 for the ANN 1, ANN 2, and ANN 3 models, respectively. The cross-validation showed that the results are consistent.
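To give a sense of model size, the sketch below counts the trainable parameters implied by each reported structure. It assumes fully connected layers with one bias per neuron, which the abstract does not state explicitly; `count_parameters` is a hypothetical helper introduced only for this example.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected feed-forward network.

    Assumes dense layers with one bias per neuron (an assumption, not a
    detail given in the abstract).
    """
    return sum(
        (n_in + 1) * n_out  # n_in weights plus one bias for each output neuron
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Best structures reported in the abstract (inputs-hidden-hidden-output).
structures = {
    "ANN 1": [39, 10, 10, 1],
    "ANN 2": [18, 10, 5, 1],
    "ANN 3": [15, 15, 10, 1],
}

param_counts = {name: count_parameters(sizes) for name, sizes in structures.items()}
for name, count in param_counts.items():
    print(f"{name}: {count} trainable parameters")
```

Under these assumptions the three structures have 521, 251, and 411 trainable parameters, respectively, which can be compared with the 235 patients available for training.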
This work showed that the association of multivariate techniques and ANNs is a powerful tool that can be used to model a complex phenomenon such as RVTE without the restrictions of existing approaches.
After proper validation, these ANN models can be used to help clinicians with decisions regarding VTE treatment.