Yang Che-Ning, Hsieh Yi-Ting, Yeh Hsu-Hang, Chu Hsiao-Sang, Wu Jo-Hsuan, Chen Wei-Li
School of Medicine, National Taiwan University, Taipei, Taiwan.
Department of Ophthalmology, National Taiwan University Hospital, Taipei, Taiwan.
Curr Eye Res. 2025 Mar;50(3):276-281. doi: 10.1080/02713683.2024.2430212. Epub 2024 Dec 9.
To examine the performance of deep-learning models that predict visual acuity after cataract surgery using preoperative clinical information and color fundus photography (CFP).
We retrospectively collected the age, sex, logMAR preoperative best-corrected visual acuity (preoperative BCVA), and CFP from patients who underwent cataract surgery from 2020 to 2021 at National Taiwan University Hospital. Feature extraction from CFP was performed using a pre-existing image classification model, Xception. The CFP-extracted features and preoperative clinical information were then fed to a downstream neural network for final prediction. We assessed model performance by calculating the mean absolute error (MAE) between the predicted and true postoperative logMAR BCVA. Nested 10-fold cross-validation was performed for model validation.
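The pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Xception embeddings are replaced with random stand-in features, the sample size, feature dimensions, and regressor (a small scikit-learn MLP in place of the paper's downstream network) are all hypothetical, and a plain (non-nested) 10-fold split is shown for brevity.

```python
# Hypothetical sketch: multimodal features (stand-ins for Xception CFP
# embeddings) concatenated with clinical predictors, fed to a small
# regressor, and scored by the MAE between predicted and true
# postoperative logMAR BCVA under 10-fold cross-validation.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 200                                       # hypothetical sample size
cfp_feats = rng.normal(size=(n, 32))          # stand-in for Xception CFP embeddings
age = rng.uniform(50, 85, size=(n, 1))
sex = rng.integers(0, 2, size=(n, 1)).astype(float)
pre_bcva = rng.uniform(0.1, 1.5, size=(n, 1))  # preoperative logMAR BCVA

X = np.hstack([age, sex, pre_bcva, cfp_feats])  # multimodal input vector
# synthetic target loosely tied to preoperative BCVA, for illustration only
y = 0.1 * pre_bcva[:, 0] + rng.normal(scale=0.1, size=n)

maes = []
for train_idx, test_idx in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    maes.append(mean_absolute_error(y[test_idx], model.predict(X[test_idx])))

print(f"MAE: {np.mean(maes):.3f} ± {np.std(maes):.3f}")
```

In the study, the fold-wise MAE mean and standard deviation reported in this way allowed the predictor sets (age/sex, age/sex/CFP, age/sex/preoperative BCVA) to be compared statistically.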
A total of 673 fundus images from 446 patients were collected. The mean preoperative and postoperative BCVA were 0.60 ± 0.39 and 0.14 ± 0.18 logMAR, respectively. The model using age and sex as predictors achieved an MAE of 0.121 ± 0.016 in postoperative BCVA prediction. The inclusion of CFP as an additional predictor (predictors: age, sex, and CFP) did not further improve predictive performance (MAE = 0.120 ± 0.023, p = 0.375), while adding preoperative BCVA as an additional predictor resulted in a 4.13% improvement (predictors: age, sex, and preoperative BCVA; MAE = 0.116 ± 0.016, p = 0.013).
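The reported 4.13% figure is the relative reduction in mean MAE between the age/sex model and the age/sex/preoperative-BCVA model, which can be verified directly:

```python
# Relative improvement in mean MAE from adding preoperative BCVA as a
# predictor, using the fold-averaged MAEs reported in the abstract.
baseline_mae = 0.121   # predictors: age, sex
bcva_mae = 0.116       # predictors: age, sex, preoperative BCVA
improvement = (baseline_mae - bcva_mae) / baseline_mae
print(f"{improvement:.2%}")  # → 4.13%
```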
Our multimodal models combining CFP and clinical information achieved good accuracy in predicting BCVA after cataract surgery, but models using only clinical information performed similarly. Future studies are needed to clarify the benefit of multimodal input for this task.