Taheri Soodejani Moslem, Tabatabaei Seyyed Mohammad, Mahmoudimanesh Marzieh
Center for Healthcare Data Modeling, Department of Biostatistics and Epidemiology, School of Public Health, Shahid Sadoughi University of Medical Sciences, Yazd, Iran.
Medical Informatics Department, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
Am J Cardiovasc Dis. 2021 Aug 15;11(4):484-488. eCollection 2021.
Heart disease is the leading cause of death worldwide: cardiovascular diseases kill 17 million people each year. Identifying the factors that affect the survival of these patients is therefore of particular importance, and choosing the most suitable model for analyzing patient survival can lead to more accurate results.
Survival analysis offers several methods for assessing one or more risk factors: the classic Kaplan-Meier method, Cox regression, parametric survival models, and newer approaches such as Bayesian survival models. Cox regression is the most common and is generally used for time-to-event data. The main difference between Cox regression and Bayesian models is that, in Bayesian models, the prior distribution can affect the parameter estimates. Some survival analysis models rest on conditions that must be checked before the data are analyzed. In this paper, we discuss these conditions using a dataset from Kaggle. The dataset contains the medical records of 299 patients with heart failure, collected at the Faisalabad Institute of Cardiology and the Allied Hospital in Faisalabad (Punjab, Pakistan) from April to December 2015.
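As a rough illustration of the classic Kaplan-Meier method mentioned above, the following minimal sketch computes the product-limit survival estimate in plain Python. The function name `kaplan_meier` and the five toy observations are assumptions made for illustration only; they are not taken from the paper or from the Kaggle dataset.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where at least one event occurs.

    times  -- follow-up time of each patient
    events -- 1 if the event (death) was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        n_at_risk -= leaving
    return curve


# Hypothetical toy data (not from the study): five patients, two censored.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(toy_times, toy_events)
for t, s in km_curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Censored patients never cause a step in the curve; they only shrink the risk set for later event times, which is why the estimate drops at three time points even though five patients are followed.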
This paper argues that when the effective sample size is insufficient, Bayesian survival models can be used to obtain more accurate results, because these models are not affected by the sample size. The results of both methods are shown for a sample of cardiac data. Based on the Bayesian Cox regression model, age, anemia, ejection fraction, high blood pressure, and serum creatinine were found to affect patient survival.
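To make the role of the prior concrete, here is a minimal sketch of a one-covariate Bayesian Cox model in plain Python. It assumes, purely for illustration, that the Cox partial likelihood stands in for the likelihood, that the coefficient has a Normal(0, prior_sd^2) prior, and that the posterior mean is computed by integrating over a grid. The abstract does not state how the authors fitted their model, and the function names and toy data below are hypothetical.

```python
import math


def cox_partial_loglik(beta, times, events, x):
    """Cox partial log-likelihood for one covariate (assumes no tied event times)."""
    total = 0.0
    for i in range(len(times)):
        if events[i] == 1:
            # Risk set: every patient still under observation at times[i].
            risk = sum(math.exp(beta * x[j])
                       for j in range(len(times)) if times[j] >= times[i])
            total += beta * x[i] - math.log(risk)
    return total


def posterior_mean(times, events, x, prior_sd, lo=-10.0, hi=10.0, steps=2001):
    """Posterior mean of beta under a Normal(0, prior_sd^2) prior, via a grid."""
    grid = [lo + k * (hi - lo) / (steps - 1) for k in range(steps)]
    log_post = [cox_partial_loglik(b, times, events, x) - 0.5 * (b / prior_sd) ** 2
                for b in grid]
    peak = max(log_post)  # subtract the peak before exponentiating, for stability
    weights = [math.exp(lp - peak) for lp in log_post]
    return sum(b * w for b, w in zip(grid, weights)) / sum(weights)


# Hypothetical toy data (not from the study): eight patients, one binary covariate.
toy_times = [1, 2, 3, 4, 5, 6, 7, 8]
toy_events = [1, 1, 1, 0, 1, 1, 0, 1]
toy_x = [1, 1, 0, 1, 0, 1, 0, 0]

mean_tight_prior = posterior_mean(toy_times, toy_events, toy_x, prior_sd=0.5)
mean_wide_prior = posterior_mean(toy_times, toy_events, toy_x, prior_sd=10.0)
print("posterior mean, tight prior:", round(mean_tight_prior, 3))
print("posterior mean, wide prior: ", round(mean_wide_prior, 3))
```

With only eight observations, the tight prior pulls the estimated log hazard ratio toward zero, while the wide prior leaves it close to what the data alone suggest. This shows why the choice of prior matters most when the sample is small.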
Bayesian models are much more accurate at estimating survival and identifying risk factors when the data concern rare diseases or diseases with low mortality, including heart disease, where patients' survival probability is higher than that of cancer patients.