Department of Health Statistics, College of Public Health, Tianjin Medical University, and Department of Internal Neurology, Tianjin Huanhu Hospital, Tianjin 300070, China.
Chin Med J (Engl). 2012 Mar;125(5):851-7.
Various methods can be applied to build predictive models for clinical data with a binary outcome variable. This study explores the process of constructing three common predictive models, logistic regression (LR), decision tree (DT) and multilayer perceptron (MLP), and focuses on specific details of applying them: which preconditions should be satisfied, how to set the model parameters, how to screen variables and build accurate models quickly and efficiently, and how to assess generalization ability (that is, prediction performance) reliably by the Monte Carlo method when the sample size is small.
A total of 274 patients with type 2 diabetes mellitus (137 with diabetic peripheral neuropathy and 137 without) from the Metabolic Disease Hospital in Tianjin participated in the study. There were 30 variables, including sex, age and glycosylated hemoglobin. Because of the small sample size, the classification and regression tree (CART) and the chi-squared automatic interaction detector (CHAID) tree were combined by means of 100 repetitions of 5- to 7-fold stratified cross-validation to build the DT. The MLP was constructed using the Schwarz Bayesian Criterion to choose the number of hidden layers and hidden-layer units, along with the Levenberg-Marquardt (L-M) optimization algorithm, weight decay and a preliminary training method. Subsequently, LR was fitted by the best subset method with the Akaike Information Criterion (AIC) to make the best use of the information and avoid overfitting. Finally, 10 to 100 repetitions of 3- to 10-fold stratified cross-validation were used to compare the generalization ability of DT, MLP and LR in terms of the area under the receiver operating characteristic (ROC) curve (AUC).
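The repeated stratified cross-validation described here can be sketched in a few lines. This is a minimal illustration rather than the authors' procedure: the function names, the round-robin fold assignment, the seed and the choice of 5 folds with 10 repetitions are all assumptions, and the labels only mirror the 137/137 case-control design.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k, rng):
    """Yield (train_idx, test_idx) pairs that preserve class proportions per fold."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        # Deal indices round-robin so each fold gets a near-equal share of this class.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def repeated_stratified_kfold(labels, k, repeats, seed=0):
    """Repeat stratified k-fold with a fresh shuffle each time (Monte Carlo style)."""
    rng = random.Random(seed)
    for _ in range(repeats):
        yield from stratified_kfold(labels, k, rng)


# Hypothetical labels mirroring the study design: 137 cases and 137 controls.
labels = [1] * 137 + [0] * 137
splits = list(repeated_stratified_kfold(labels, k=5, repeats=10))
print(len(splits))  # 10 repeats x 5 folds
```

Each model would be fitted on `train` and scored on `test` for every split, and the per-split scores averaged. Repeating the shuffle reduces the dependence of the estimate on any single partition, which matters most when the sample is small.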
The AUCs of DT, MLP and LR were 0.8863, 0.8536 and 0.8802, respectively. Because a larger AUC indicates higher diagnostic ability, MLP performed best, followed by LR and DT, under 10 to 100 repetitions of 2- to 10-fold stratified cross-validation in our study. The neural network model is a preferred option for these data. However, best subset multiple LR would be a better choice in view of efficiency and accuracy.
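For readers unfamiliar with the AUC, it equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition. The function name and the toy scores are illustrative assumptions and are not data from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example (made-up scores): three of the four positive/negative pairs
# are ordered correctly, so the AUC is 0.75.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a sample of a few hundred patients. A rank-based formulation gives the same value more efficiently on large data sets.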
When dealing with data from a small sample with multiple independent variables and a dichotomous outcome variable, additional strategies and statistical techniques (such as the AIC, the L-M optimization algorithm and the best subset method) should be considered when building a prediction model, and available methods (such as cross-validation and the AUC) can be used for evaluation.
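As a rough illustration of how information criteria guide model choice, the sketch below uses the standard definitions AIC = 2k - 2 ln L and SBC = k ln n - 2 ln L, where k is the number of estimated parameters, n the number of observations and ln L the maximized log-likelihood. The candidate subsets, variable names and log-likelihood values are invented for the example and are not results from the study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def sbc(log_likelihood, n_params, n_obs):
    """Schwarz Bayesian Criterion: penalizes parameters more as n_obs grows."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical candidate subsets: (log-likelihood, parameters incl. intercept).
# These numbers are made up purely to show the selection step.
candidates = {
    ("age",): (-170.0, 2),
    ("age", "hba1c"): (-150.0, 3),
    ("age", "hba1c", "sex"): (-149.5, 4),
}

n_obs = 274  # sample size reported in the abstract
best_by_aic = min(candidates, key=lambda s: aic(*candidates[s]))
best_by_sbc = min(candidates, key=lambda s: sbc(*candidates[s], n_obs))
print(best_by_aic, best_by_sbc)
```

In this made-up case the third variable improves the log-likelihood by only 0.5, which does not offset the penalty for an extra parameter, so both criteria prefer the two-variable subset. A full best subset search would fit every candidate subset and apply the same comparison.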