Goldbaum Michael H, Falkenstein Irina, Kozak Igor, Hao Jiucang, Bartsch Dirk-Uwe, Sejnowski Terrance, Freeman William R
Ophthalmic Informatics Laboratory, University of California at San Diego, USA.
Trans Am Ophthalmol Soc. 2008;106:196-204; discussion 204-5.
To test the following hypotheses: (1) eyes from individuals with human immunodeficiency virus (HIV) have electrophysiologic abnormalities that manifest as multifocal electroretinogram (mfERG) abnormalities; (2) the retinal effects of HIV in immune-competent HIV-positive individuals differ from the effects in immune-incompetent HIV-positive individuals; (3) strong machine learning classifiers (MLCs), such as the support vector machine (SVM), can learn to use mfERG abnormalities in the second-order kernel (SOK) to distinguish eyes of HIV-positive individuals from normal eyes; and (4) the mfERG abnormalities fall into patterns that can be discerned by MLCs. We applied a supervised MLC, SVM, to determine whether mfERGs in eyes from patients with HIV differ from mfERGs in HIV-negative controls.
Ninety-nine HIV-positive patients without visible retinopathy were divided into 2 groups: (1) 59 high-CD4 individuals (H, 104 eyes), 48.5 +/- 7.7 years, whose CD4 counts were never observed below 100, and (2) 40 low-CD4 individuals (L, 61 eyes), 46.2 +/- 5.6 years, whose CD4 counts were below 100 for at least 6 months. The normal group (N, 82 eyes) had 41 age-matched HIV-negative individuals, 46.8 +/- 6.2 years. The amplitude and latency of the first positive curve (P1, hereafter referred to as a) and the first negative curve (N1, referred to as b) in the SOK of 103 hexagon patterns of the central 28 degrees of the retina were recorded from the eyes in each group. SVM was trained and tested with cross-validation to distinguish H from N and L from N. SOK was chosen as a presumed detector of inner retinal abnormalities. Classifier performance was measured with the area under the receiver operating characteristic (AUROC) curve to permit comparison of MLCs. Improvement in performance and identification of subsets of the most important features were sought with feature selection by backward elimination.
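The two evaluation ingredients named above, AUROC and feature selection by backward elimination, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `auroc`, `backward_elimination`, and `toy_score` are hypothetical, and the toy scoring function stands in for what in the study would be the cross-validated AUROC of an SVM trained on a subset of the 103 hexagons.

```python
from typing import Callable, List, Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve by pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    scores higher than a randomly chosen negative (label 0); ties count 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def backward_elimination(
    features: List[int],
    score_subset: Callable[[List[int]], float],
    min_features: int = 1,
) -> List[Tuple[List[int], float]]:
    """Greedy backward elimination.

    At each step, drop the single feature whose removal leaves the
    highest-scoring subset. Returns the history of (subset, score)
    pairs, starting with the full feature set.
    """
    current = list(features)
    history = [(list(current), score_subset(current))]
    while len(current) > min_features:
        candidates = []
        for f in current:
            reduced = [g for g in current if g != f]
            candidates.append((score_subset(reduced), reduced))
        best_score, current = max(candidates, key=lambda c: c[0])
        history.append((list(current), best_score))
    return history


def toy_score(subset: List[int]) -> float:
    # Hypothetical stand-in for cross-validated classifier performance:
    # features 1 and 3 help, every other feature slightly hurts.
    informative = {1, 3}
    return float(sum(10 if f in informative else -1 for f in subset))


# AUROC on four toy scores: 3 of the 4 positive/negative pairs are ordered correctly.
example_auroc = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])

# Backward elimination over five toy features; keep the best subset seen.
history = backward_elimination([0, 1, 2, 3, 4], toy_score)
best_subset, best_score = max(history, key=lambda h: h[1])
```

Keeping the whole history rather than only the final subset makes it possible to pick the subset size with the highest score afterwards, which is how a reduced set such as 21 of 103 hexagons could be selected.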
In general, the SOK b-parameters separated L from N and H from N better than a-parameters, and latency separated L from N and H from N better than amplitude. In the HIV groups, on average, amplitude was diminished and latency was extended. The parameter that most consistently separated L from N and H from N was b-latency. With b-latency, SVM learned to distinguish L from N (AUROC = 0.730 +/- 0.044, P = .001 against chance [0.500 +/- 0.051]) and H from N (0.732 +/- 0.038, P = .0001 against chance) equally well. With best-performing subsets (21 out of 103 hexagons) derived by backward elimination, SVM distinguished L from N (0.869 +/- 0.030, P < .00005 against chance) and H from N (0.859 +/- 0.029, P < .00005 against chance) better than SVM with the full set of hexagons. Mapping the top 10 hexagon locations for L vs N and H vs N produced no apparent pattern.
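The abstract does not say how the chance-level AUROC (0.500 +/- 0.051) was obtained. One common way to estimate such a baseline, shown here purely as an assumption, is to shuffle the class labels many times and recompute the AUROC each time. The helper names `auroc` and `chance_auroc` and the synthetic scores are hypothetical and are not study data.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def chance_auroc(
    scores: Sequence[float],
    labels: Sequence[int],
    n_shuffles: int = 500,
    seed: int = 0,
) -> Tuple[float, float]:
    """Mean and standard deviation of AUROC under shuffled labels."""
    rng = random.Random(seed)
    shuffled: List[int] = list(labels)
    values = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any link between score and label
        values.append(auroc(scores, shuffled))
    return statistics.mean(values), statistics.stdev(values)


# Synthetic classifier outputs for 20 "patient" and 20 "control" eyes.
data_rng = random.Random(1)
scores = [data_rng.gauss(0.0, 1.0) for _ in range(40)]
labels = [1] * 20 + [0] * 20

mean_chance, sd_chance = chance_auroc(scores, labels)
```

The mean of the shuffled AUROCs should sit near 0.5, and their spread gives a reference against which an observed AUROC can be compared.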
This study confirms that mfERG SOK abnormalities develop in the retina of HIV-positive individuals. The new finding of equal severity of b-latency abnormalities in the low- and high-CD4 groups indicates that good immune status under highly active antiretroviral therapy may not protect against retinal damage and, by extension, damage elsewhere. SOKs are difficult for human experts to interpret. Machine learning classifiers, such as SVM, learn from the data without human intervention, reducing the need to rely on human skills to interpret this test.