Yang Xiao-Hui, Liao Hao-Jie, Sun Pei-Yu, Ma Jing, Wang Bing, He Yan, Xue Liu-Gen, Su Li-Min, Wang Bin-Jie
IEEE J Biomed Health Inform. 2025 Jun;29(6):3894-3905. doi: 10.1109/JBHI.2024.3379432.
Estimating heterogeneous causal effects at the individual level is a core issue in causal inference, and its application in medicine remains an active and challenging problem. In high-stakes decision-making domains such as healthcare, inappropriate treatments can have serious negative impacts on patients. Recently, machine learning-based methods have been proposed to improve the accuracy of causal effect estimation. However, many of these methods concentrate on estimating causal effects on continuous outcome variables under binary interventions and give less consideration to multivariate interventions or discrete outcome variables, which limits their scope of application. To tackle this issue, we combine the double machine learning framework with the Light Gradient Boosting Machine (LightGBM) and propose a double LightGBM model. This model can estimate binary causal effects more accurately and in less time. Two cyclic structures are added to the model. A data correction method is introduced and improved to transform discrete outcome variables into continuous ones. On this basis, we propose the Multivariate Cyclic Double LightGBM model (MCD-LightGBM) to estimate multivariate treatment effects. We also design a visual human-computer interaction system for heterogeneous causal effect estimation that can be applied to different types of data. We report results on two clinical problems: the system improved the Logarithm of the Minimum Angle of Resolution (LogMAR) of visual acuity change after anti-Vascular Endothelial Growth Factor (anti-VEGF) treatment in patients with diabetic macular degeneration from 0.05 to 0.33, and reduced the readmission rate of diabetic patients after cure from 48.4% to 10.5%. These results demonstrate the potential of the proposed system for predicting heterogeneous effects of clinical drug treatments.
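The abstract does not give the model's equations or code, so the following is only a minimal sketch of the generic double machine learning idea it builds on (partialling-out with cross-fitting) for a binary treatment. It makes several assumptions: the data are simulated, the effect is taken to be constant, and a simple least-squares line stands in for the LightGBM nuisance learners so that the example needs only the standard library. The names `fit_line`, `dml_effect`, and `simulate` are hypothetical, and the sketch does not reproduce the cyclic structures, the data correction step, or MCD-LightGBM.

```python
import math
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares line y = a + b*x (stand-in for a LightGBM learner)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def dml_effect(xs, ts, ys, n_folds=2, fit=fit_line):
    """Partialling-out double ML estimate of a constant treatment effect."""
    n = len(xs)
    t_res, y_res = [0.0] * n, [0.0] * n
    for k in range(n_folds):
        test = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        # Nuisance models are fitted on the other folds only (cross-fitting).
        m_hat = fit([xs[i] for i in train], [ts[i] for i in train])  # E[T | X]
        g_hat = fit([xs[i] for i in train], [ys[i] for i in train])  # E[Y | X]
        for i in test:
            t_res[i] = ts[i] - m_hat(xs[i])
            y_res[i] = ys[i] - g_hat(xs[i])
    # Regress outcome residuals on treatment residuals.
    return sum(t * y for t, y in zip(t_res, y_res)) / sum(t * t for t in t_res)


def simulate(n=2000, true_effect=2.0, seed=0):
    """Simulated data: confounder x raises both treatment odds and outcome."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ts = [1.0 if rng.random() < 1 / (1 + math.exp(-x)) else 0.0 for x in xs]
    ys = [true_effect * t + 1.5 * x + rng.gauss(0, 1) for x, t in zip(xs, ts)]
    return xs, ts, ys


xs, ts, ys = simulate()
theta_hat = dml_effect(xs, ts, ys)

# Naive comparison of treated and untreated means ignores the confounder.
treated = [y for t, y in zip(ts, ys) if t == 1.0]
control = [y for t, y in zip(ts, ys) if t == 0.0]
naive = mean(treated) - mean(control)

print(f"double ML estimate: {theta_hat:.2f}, naive difference: {naive:.2f}")
```

In this simulation the true effect is 2.0 by construction, so the residual-on-residual estimate should land close to it while the naive difference in means is inflated by confounding. Because `dml_effect` takes the learner as its `fit` argument, a gradient-boosted regressor could be substituted without changing the cross-fitting logic.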