Kikuchi Yusuke, Kawczynski Michael G, Anegondi Neha, Neubert Ales, Dai Jian, Ferrara Daniela, Quezada-Ruiz Carlos
Roche Personalized Healthcare Program, Genentech, Inc., South San Francisco, California.
Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley, California.
Ophthalmol Sci. 2023 Aug 18;4(2):100385. doi: 10.1016/j.xops.2023.100385. eCollection 2024 Mar-Apr.
To develop machine learning (ML) models to predict, at baseline, treatment outcomes at month 9 in patients with neovascular age-related macular degeneration (nAMD) receiving faricimab.
Retrospective proof-of-concept study.
Patients enrolled in the phase II AVENUE trial (NCT02484690) of faricimab in nAMD.
Baseline characteristics and spectral-domain optical coherence tomography (SD-OCT) image data from 185 faricimab-treated eyes were split into 80% training and 20% test sets at the patient level. Input variables were baseline age, sex, best-corrected visual acuity (BCVA), central subfield thickness (CST), low luminance deficit, treatment arm, and SD-OCT images. A regression problem (BCVA) and a binary classification problem (reduction of CST by 35%) were considered. Overall, 10 models were developed and tested for each problem. Benchmark classical ML models (linear, random forest, extreme gradient boosting) were trained on baseline characteristics; benchmark deep neural networks (DNNs) were trained on baseline SD-OCT B-scans. Baseline characteristics and SD-OCT data were merged using 2 approaches: model stacking (using the DNN prediction as an input feature for classical ML models) and model averaging (averaging the predictions from the DNN using the SD-OCT volume and from classical ML models using baseline characteristics).
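The patient-level split and the two merging approaches described above can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the seed, and the toy inputs are hypothetical, and the classical ML model that would consume the stacked features is omitted.

```python
import random


def split_by_patient(eye_to_patient, test_fraction=0.2, seed=0):
    """Split eyes into train/test so that all eyes of one patient share a set.

    eye_to_patient maps a (hypothetical) eye ID to its patient ID.
    """
    patients = sorted(set(eye_to_patient.values()))
    random.Random(seed).shuffle(patients)
    n_test = max(1, round(test_fraction * len(patients)))
    test_patients = set(patients[:n_test])
    train_eyes = [e for e, p in eye_to_patient.items() if p not in test_patients]
    test_eyes = [e for e, p in eye_to_patient.items() if p in test_patients]
    return train_eyes, test_eyes


def stack_features(baseline_rows, dnn_predictions):
    """Model stacking: append the DNN prediction to each baseline feature row.

    The resulting rows would then be fed to a classical ML model.
    """
    return [list(row) + [pred] for row, pred in zip(baseline_rows, dnn_predictions)]


def average_predictions(dnn_predictions, classical_predictions):
    """Model averaging: unweighted mean of the two models' predictions."""
    return [(d + c) / 2.0 for d, c in zip(dnn_predictions, classical_predictions)]
```

Splitting on patient IDs rather than on eyes keeps data from one patient out of both sets at once, which is the point of a patient-level split.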
Treatment outcomes were defined by 2 target variables: functional (BCVA letter score) and anatomical (percent decrease in CST from baseline) outcomes at month 9.
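As a worked illustration of the anatomical target, the sketch below computes the percent decrease in CST from baseline and derives a binary label from it. The function names are hypothetical, and treating a reduction of exactly 35% as a positive label (using `>=`) is an assumption; the abstract does not state how the boundary was handled.

```python
def percent_cst_decrease(baseline_cst_um, month9_cst_um):
    """Percent decrease in CST from baseline (positive means the retina thinned)."""
    return 100.0 * (baseline_cst_um - month9_cst_um) / baseline_cst_um


def cst_reduction_label(baseline_cst_um, month9_cst_um, threshold_pct=35.0):
    """Binary target: 1 if CST fell by at least threshold_pct (assumed inclusive)."""
    return int(percent_cst_decrease(baseline_cst_um, month9_cst_um) >= threshold_pct)
```

For example, an eye going from 400 µm at baseline to 260 µm at month 9 has a 35% decrease, while one going from 400 µm to 300 µm has a 25% decrease.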
The best-performing BCVA regression model with respect to the test coefficient of determination (R²) was the linear model in the model-stacking approach, with an R² of 0.31. The best-performing CST classification model with respect to the test area under the receiver operating characteristic curve (AUROC) was the benchmark linear model, with an AUROC of 0.87. A post hoc analysis showed that baseline BCVA and baseline CST had the greatest effect on the predictions across models for BCVA regression and CST classification, respectively.
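For readers less familiar with the two test metrics, the following sketch shows how they are commonly computed. It uses the standard definitions (R² as one minus the ratio of residual to total sum of squares; AUROC as the fraction of positive-negative pairs ranked correctly, with ties counted as one half). It is not taken from the study's code, and the function names are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else (0.5 if p == n else 0.0)
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Under these definitions, an R² of 0 corresponds to always predicting the mean outcome, and an AUROC of 0.5 corresponds to ranking at chance level.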
Promising signals for predicting treatment outcomes from baseline characteristics were detected; however, the predictive benefit of baseline images was unclear in this proof-of-concept study. Further testing and validation with larger, independent datasets are required to fully explore the predictive capacity of ML models using baseline imaging data.
Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.