Strimenopoulou Foteini, Brown Philip J
University of Kent.
Stat Appl Genet Mol Biol. 2008;7(2):Article9. doi: 10.2202/1544-6115.1359. Epub 2008 Feb 21.
We construct a diagnostic predictor for patient disease status from a single data set of serum-sample mass spectra paired with a binary case-control response. The model is logistic regression, with the Bernoulli log-likelihood augmented by either a quadratic (ridge) penalty or an absolute-value (L1) penalty. For ridge penalization, the singular value decomposition lets us reduce the number of variables in the maximization to the rank of the design matrix. The penalization hyperparameter is chosen by 10-fold cross-validation with log-likelihood loss. Predictive ability is judged on a set-aside subset of the data.
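The rank reduction for the ridge case can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a design matrix X (n samples by p spectral variables, p much larger than n), an unpenalized intercept, a penalty of the form (lam/2)·||beta||², and plain Newton-Raphson without step-halving. The function names `svd_reduce`, `fit_ridge_logistic` and `cv_log_loss`, the simulated data and the lambda grid are all hypothetical. Writing X = U D Vᵀ and beta = V gamma gives X beta = (U D) gamma and ||beta|| = ||gamma||, so the fit only needs r = rank(X) coefficients.

```python
import numpy as np

def svd_reduce(X, tol=1e-10):
    """Return Z = U*D (n x r) and V (p x r), where r is the numerical rank of X."""
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    r = int(np.sum(d > tol * d[0]))
    return U[:, :r] * d[:r], Vt[:r].T

def fit_ridge_logistic(Z, y, lam, n_iter=50, tol=1e-8):
    """Newton-Raphson for the Bernoulli log-likelihood minus (lam/2)*||gamma||^2.
    Returns [intercept, gamma]; the intercept is left unpenalized (an assumption)."""
    n, r = Z.shape
    A = np.hstack([np.ones((n, 1)), Z])
    P = lam * np.eye(r + 1)
    P[0, 0] = 0.0                       # no penalty on the intercept
    theta = np.zeros(r + 1)
    for _ in range(n_iter):
        eta = np.clip(A @ theta, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        grad = A.T @ (y - p) - P @ theta
        hess = A.T @ (A * w[:, None]) + P
        step = np.linalg.solve(hess, grad)
        theta += step
        if np.max(np.abs(step)) < tol:
            break
    return theta

def cv_log_loss(Z, y, lams, k=10, seed=0):
    """Mean held-out negative log-likelihood for each lam under k-fold CV."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    loss = np.zeros(len(lams))
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        for j, lam in enumerate(lams):
            theta = fit_ridge_logistic(Z[train], y[train], lam)
            eta = theta[0] + Z[test] @ theta[1:]
            # negative Bernoulli log-likelihood: log(1 + e^eta) - y*eta
            loss[j] += np.sum(np.logaddexp(0.0, eta) - y[test] * eta)
    return loss / len(y)

# Simulated stand-in for spectra: 40 samples, 500 variables (hypothetical data).
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 500))
y = (X[:, 0] + 0.5 * rng.standard_normal(40) > 0).astype(float)

Z, V = svd_reduce(X)                    # Z has only rank(X) columns
lams = [1.0, 10.0, 100.0]               # illustrative grid
best_lam = lams[int(np.argmin(cv_log_loss(Z, y, lams)))]
theta = fit_ridge_logistic(Z, y, best_lam)
beta = V @ theta[1:]                    # coefficients on the original p variables
```

Because the decomposition does not use the response, it can be computed once and the rows of Z subset within each fold. The ridge solution for any training subset lies in the row space of X, so fitting on Z gives the same fitted model as fitting on the original variables. The L1 penalty is not rotation invariant, so this shortcut does not carry over to it.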