Nam Sang Min, Peterson Thomas A, Butte Atul J, Seo Kyoung Yul, Han Hyun Wook
Department of Ophthalmology, CHA Bundang Medical Center, CHA University, Seongnam, Republic of Korea.
Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, United States.
JMIR Med Inform. 2020 Feb 20;8(2):e16153. doi: 10.2196/16153.
Dry eye disease (DED) is a complex disease of the ocular surface, and its associated factors are important for understanding and effectively treating DED.
This study aimed to build an integrative, personalized explanatory model of DED using as many factors as possible from the Korea National Health and Nutrition Examination Survey (KNHANES) data.
Using KNHANES data for 2012 (4391 sample cases), a point-based scoring system was created for ranking factors associated with DED and assessing patient-specific DED risk. First, decision trees were used to categorize continuous factors, and lasso was used to select important factors. Next, a survey-weighted multiple logistic regression was trained using these factors, and points were assigned using the regression coefficients. Finally, network graphs of partial correlations between factors were used to study the interrelatedness of DED-associated factors.
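The step of assigning points from regression coefficients can be illustrated with a minimal sketch. The abstract does not state how coefficients were scaled, so the rule below (divide each log-odds coefficient by a reference unit and round to the nearest integer) is an assumption, and the function name, factor names, and coefficient values are hypothetical.

```python
def coefficients_to_points(coefs, unit=None):
    """Convert log-odds coefficients to integer points.

    Assumed scaling rule (not taken from the abstract): points are
    round(beta / unit). If no unit is given, the smallest nonzero
    absolute coefficient is used as the unit.
    """
    if unit is None:
        unit = min(abs(b) for b in coefs.values() if b != 0)
    return {name: round(b / unit) for name, b in coefs.items()}


# Hypothetical coefficients, for illustration only.
example_coefs = {"factor_a": 0.90, "factor_b": 0.42, "factor_c": -0.40}
example_points = coefficients_to_points(example_coefs, unit=0.1)
```

A positive coefficient yields positive points (higher risk) and a negative coefficient yields negative points, which matches the sign convention of the points reported in the results.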
The point-based model achieved an area under the curve of 0.70 (95% CI 0.61-0.78), and 13 of 78 factors considered were chosen. Important factors included sex (+9 points for women), corneal refractive surgery (+9 points), current depression (+7 points), cataract surgery (+7 points), stress (+6 points), age (54-66 years; +4 points), rhinitis (+4 points), lipid-lowering medication (+4 points), and intake of omega-3 (0.43%-0.65% kcal/day; -4 points). Among these, the age group 54 to 66 years had high centrality in the network, whereas omega-3 had low centrality.
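As a rough illustration of how a point-based score could be tallied for one patient, the sketch below sums the points reported above. It covers only the 9 factors listed in this abstract (the full model selected 13), the dictionary keys and function name are hypothetical, and the abstract gives no mapping from total points to a DED probability, so the result is only a partial sum.

```python
# Points as reported in the abstract; key names are hypothetical labels.
REPORTED_POINTS = {
    "female": 9,
    "corneal_refractive_surgery": 9,
    "current_depression": 7,
    "cataract_surgery": 7,
    "stress": 6,
    "age_54_66": 4,
    "rhinitis": 4,
    "lipid_lowering_medication": 4,
    "omega3_intake_in_range": -4,  # 0.43%-0.65% kcal/day
}


def ded_partial_score(present_factors):
    """Sum the reported points for the factors present in one patient."""
    present = set(present_factors)
    unknown = present - set(REPORTED_POINTS)
    if unknown:
        raise KeyError(f"factors not listed in the abstract: {sorted(unknown)}")
    return sum(REPORTED_POINTS[f] for f in present)


# Example: a woman with prior corneal refractive surgery whose
# omega-3 intake falls in the reported range: 9 + 9 - 4.
score = ded_partial_score(
    ["female", "corneal_refractive_surgery", "omega3_intake_in_range"]
)
```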
Integrative understanding of DED was possible using the machine learning-based model and network-based factor analysis. This method for finding important risk factors and identifying patient-specific risk could be applied to other multifactorial diseases.