Pillai Rakesh N, Alex Aleena, M S Narassima, Verma Vivek, Shaji Ajil, Pavithran Keechilat, Vijaykumar D K, John Denny
Evidence Synthesis & Outcomes Research, Epi-Fractals Biosystems Technopark, 695011, Trivandrum, Kerala, India.
Great Lakes Institute of Management, Chennai, 603102, Tamil Nadu, India.
Sci Rep. 2025 Jan 8;15(1):1323. doi: 10.1038/s41598-024-83896-1.
Background: Breast cancer is a significant public health concern in India, accounting for 28% of all cancer diagnoses and imposing a substantial economic burden. This study introduces a novel approach to forecasting the number of breast cancer cases (based on prevalence rates) and estimating the associated economic impact in India using the autoregressive integrated moving average (ARIMA) model.

Methods: Data on the prevalence of breast cancer in India from 2000 to 2021 were obtained from the Global Burden of Disease (GBD) database. This dataset provided annual estimates of the number of patients with breast cancer, serving as the basis for modeling future prevalence and estimating the economic burden. The ARIMA model was used for time-series forecasting of breast cancer prevalence in India up to 2030. The data were visualized and checked for stationarity using the Augmented Dickey-Fuller (ADF) test. The model orders (p, d, q) were determined from the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots. Several ARIMA configurations were tested to identify the model with the best fit. Goodness of fit was assessed using standard metrics such as the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). The residuals were tested using the Box-Ljung test to confirm the absence of autocorrelation and verify that they behaved as white noise. Using the fitted ARIMA model, prevalence rates were forecast from 2022 to 2030, with 95% confidence intervals to capture prediction uncertainty. Direct costs were calculated from the medical expenses of breast cancer patients, such as hospital visits, diagnostic tests, treatment costs, and follow-up care. A bottom-up approach was applied, which aggregates the individual cost components from each stage of care to estimate the total direct burden of disease. Indirect costs were estimated using the human capital approach, which assesses productivity losses due to morbidity and premature mortality. The Disability-Adjusted Life Years (DALY) associated with breast cancer were also predicted using the ARIMA model.

Results: The coefficient of determination (0.99), mean absolute percentage error (69%), mean absolute error (5229), and root mean squared error (6451.2) showed that the ARIMA (0,2,0) model fitted well. The coefficient of determination (0.99) indicated that 99% of the variance in the data was explained by the model. The Akaike information criterion (411.54) and Bayesian information criterion (412.53) indicated that the ARIMA (0,2,0) model was reliable for analyzing these data. The relative error of prediction (2.76%) also suggested that the model predicted well. The number of patients with breast cancer from 2021 to 2030 was predicted to be about 1.25 million, 1.29 million, 1.34 million, 1.39 million, 1.44 million, 1.48 million, 1.53 million, 1.58 million, 1.63 million, and 1.68 million, respectively. The total economic burden of breast cancer from 2021 to 2030 was estimated to be $8 billion, $8.72 billion, $9.05 billion, $9.84 billion, $10.20 billion, $11.07 billion, $11.49 billion, $12.44 billion, $12.91 billion, and $13.95 billion, respectively, a significant rise over the period.

Conclusion: Breast cancer prevalence and its economic impact are projected to grow substantially in India. Between 2021 and 2030, the number of breast cancer patients is expected to increase by approximately 0.05 million annually, with an annual increase rate of about 5.6%. The associated economic burden will also rise, averaging an additional $19.55 billion per year, underscoring the need for intensified healthcare and economic strategies to manage this growing challenge.
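
To make the selected model concrete, the following is a minimal sketch of what an ARIMA (0,2,0) forecast amounts to, assuming no constant term: the second difference of the series is treated as white noise, so the point forecast continues the last observed slope in a straight line, and the forecast variance grows with the horizon. The function name `arima_020_forecast`, the normal-approximation multiplier `z=1.96`, and the toy series are illustrative assumptions; they are not the authors' code or the GBD data, and a statistical package may estimate the innovation variance slightly differently.

```python
import math


def arima_020_forecast(series, horizon, z=1.96):
    """Forecast an ARIMA(0,2,0) process with no constant term.

    Returns a list of (point, lower, upper) tuples for steps 1..horizon.
    Hypothetical helper for illustration only.
    """
    if len(series) < 3:
        raise ValueError("need at least three observations")

    # Second differences: y[t] - 2*y[t-1] + y[t-2]. Under ARIMA(0,2,0)
    # these are the model's white-noise innovations.
    d2 = [series[t] - 2 * series[t - 1] + series[t - 2]
          for t in range(2, len(series))]
    sigma2 = sum(e * e for e in d2) / len(d2)  # simple variance estimate

    last = series[-1]
    slope = series[-1] - series[-2]

    forecasts = []
    for h in range(1, horizon + 1):
        # Point forecast: straight-line extension of the last slope.
        point = last + h * slope
        # The psi-weights of (1 - B)^-2 are 1, 2, 3, ..., so the h-step
        # forecast variance is sigma^2 * (1^2 + 2^2 + ... + h^2).
        se = math.sqrt(sigma2 * h * (h + 1) * (2 * h + 1) / 6)
        forecasts.append((point, point - z * se, point + z * se))
    return forecasts


# Toy series (made-up numbers, not the GBD prevalence data).
toy_series = [100, 110, 121, 133, 146]
for step, (point, lower, upper) in enumerate(arima_020_forecast(toy_series, 3), 1):
    print(f"h={step}: {point:.2f} [{lower:.2f}, {upper:.2f}]")
```

Because the interval half-width scales with the square root of 1² + 2² + ... + h², uncertainty widens quickly at longer horizons, which is worth keeping in mind when reading forecasts several years out.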
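
The annual growth figures can be related back to the listed forecasts with simple arithmetic. The sketch below computes the mean year-on-year increment of the forecast patient counts, which comes to roughly 0.05 million, and a compound annual growth rate. How an "annual increase rate" is defined varies between studies; the compound rate used here is one common choice and is an assumption, not necessarily the definition the authors used. The function names are hypothetical.

```python
# Forecast patient counts in millions for 2021-2030, as listed in the abstract.
patients = [1.25, 1.29, 1.34, 1.39, 1.44, 1.48, 1.53, 1.58, 1.63, 1.68]


def mean_annual_increment(values):
    """Average absolute change per year between the first and last value."""
    return (values[-1] - values[0]) / (len(values) - 1)


def compound_annual_growth_rate(values):
    """Constant yearly growth rate that takes the first value to the last."""
    years = len(values) - 1
    return (values[-1] / values[0]) ** (1 / years) - 1


print(f"mean increment: {mean_annual_increment(patients):.4f} million per year")
print(f"compound annual growth: {compound_annual_growth_rate(patients):.2%}")
```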