Peterson Leif E, Coleman Matthew A
Center for Biostatistics, The Methodist Hospital Research Institute, Houston, TX 77030, USA.
Int J Data Min Bioinform. 2009;3(4):382-97.
We propose Random Spherical Linear Oracles (RSLO) for classifier fusion on DNA microarray gene expression data. RSLO applies random hyperplane splits to samples in the principal component score space defined by the first three principal components (X, Y, Z) of the input feature set. The hyperplane splits assign training (and testing) samples to separate logistic regression mini-classifiers, which increases the diversity of voting results because errors are not shared across mini-classifiers. We recommend running RSLO with 3 to 4 repetitions of 10-fold cross-validation (CV), randomly re-partitioning the samples before each 10-fold CV, that is, every ten iterations. This equates to a total of 30 to 40 iterations.
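The splitting step and the recommended CV schedule can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the random hyperplane is drawn, so the sketch assumes it is the perpendicular bisector of two randomly chosen samples; the function names `hyperplane_split` and `repeated_cv_folds`, the seeds, and the synthetic 3-D scores are all hypothetical. PCA and the logistic regression mini-classifiers are omitted: the scores stand in for the first three principal component scores, and each side of the split would feed its own mini-classifier.

```python
import random


def hyperplane_split(scores, rng):
    """Assign each sample to one side (0 or 1) of a random hyperplane.

    Assumption (not stated in the abstract): the hyperplane is the
    perpendicular bisector of two randomly chosen samples, so a sample
    lands on the side of whichever of the two anchors is nearer.
    """
    a, b = rng.sample(scores, 2)

    def sqdist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    return [0 if sqdist(s, a) <= sqdist(s, b) else 1 for s in scores]


def repeated_cv_folds(n_samples, n_repeats=3, n_folds=10, seed=0):
    """Yield (repeat, fold, test_indices), reshuffling before each repeat."""
    rng = random.Random(seed)
    for rep in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # random re-partition before each 10-fold CV
        for fold in range(n_folds):
            yield rep, fold, order[fold::n_folds]


rng = random.Random(42)
# Synthetic stand-in for (X, Y, Z) principal component scores of 50 samples.
scores = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
sides = hyperplane_split(scores, rng)

# 3 repetitions of 10-fold CV give 30 iterations in total.
iterations = list(repeated_cv_folds(len(scores), n_repeats=3))
print(len(iterations))  # 30
```

Setting `n_repeats=4` gives the upper end of the recommended range, 40 iterations.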