School of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, China.
Math Biosci Eng. 2022 May 20;19(8):7521-7542. doi: 10.3934/mbe.2022354.
As survival analysis has developed, statistical inference for right-censored data has become increasingly important in medical diagnosis research. This study proposes rcICQRNN, a survival prediction model for right-censored data built on an improved composite quantile regression neural network framework. The model incorporates composite quantile regression into the loss function of a feedforward neural network with multiple hidden layers and combines it with an inverse probability weighting method for survival prediction. The network's hyperparameters are tuned with the whale optimization algorithm (WOA), categorical features are encoded with both integer encoding and One-Hot encoding, and a BWOA variable selection method is proposed for high-dimensional data. The rcICQRNN algorithm was tested on one simulated dataset and two real breast cancer datasets (NKI70 and METABRIC), and model performance was assessed with three evaluation metrics. The results show that the rcICQRNN-5 model is best suited to the simulated dataset, that the WOA-rcICQRNN-30 model with One-Hot encoding is better suited to the NKI70 data, and that on the METABRIC dataset the model performs best with k=15 after feature selection. Finally, the method was validated across datasets. Overall, the C-index results obtained with One-Hot encoded data are more stable, which suggests that the proposed rcICQRNN prediction model is flexible enough to assist medical decision making. It has practical applications in areas such as biomedicine, actuarial science and financial economics.
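To make the combination of composite quantile regression and inverse probability weighting concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the common choices of a Kaplan-Meier estimate of the censoring distribution, weights of the form delta_i / G(T_i-), and equally spaced quantile levels k/(K+1); the abstract does not specify these details. All function names (`composite_taus`, `censoring_survival`, `ipcw_weights`, `ipcw_composite_quantile_loss`) are hypothetical, and `q_pred` stands in for the network's K quantile outputs per subject.

```python
import numpy as np

def composite_taus(K):
    """Equally spaced quantile levels k/(K+1), k = 1..K (an assumed, common choice)."""
    return np.arange(1, K + 1) / (K + 1)

def censoring_survival(times, events):
    """Kaplan-Meier estimate of the censoring survival function G(t-),
    evaluated just before each subject's observed time.
    events: 1 = event observed, 0 = right-censored."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    order = np.argsort(times, kind="stable")
    t_sorted = times[order]
    cens_sorted = 1 - events[order]  # censoring plays the role of the "event" for G
    n = len(times)
    g_left = np.empty(n)
    surv = 1.0
    i = 0
    while i < n:
        j = i
        while j < n and t_sorted[j] == t_sorted[i]:
            j += 1  # group tied observation times
        g_left[order[i:j]] = surv  # left limit: value before this time is processed
        at_risk = n - i
        surv *= 1.0 - cens_sorted[i:j].sum() / at_risk
        i = j
    return g_left

def ipcw_weights(times, events, eps=1e-12):
    """Inverse probability of censoring weights: delta_i / G(T_i-).
    Censored subjects get weight 0; uncensored ones are up-weighted."""
    g = np.maximum(censoring_survival(times, events), eps)
    return np.asarray(events, dtype=float) / g

def ipcw_composite_quantile_loss(y, q_pred, taus, weights):
    """Weighted composite quantile (pinball) loss.
    y: (n,) observed times; q_pred: (n, K) predicted quantiles; taus: (K,)."""
    y = np.asarray(y, dtype=float)[:, None]
    q_pred = np.asarray(q_pred, dtype=float)
    taus = np.asarray(taus, dtype=float)[None, :]
    u = y - q_pred
    pinball = np.maximum(taus * u, (taus - 1.0) * u)  # check function rho_tau(u)
    per_sample = pinball.mean(axis=1)                 # average over the K quantiles
    return float(np.sum(np.asarray(weights) * per_sample) / len(per_sample))

if __name__ == "__main__":
    # Toy data: four subjects, the second one is right-censored.
    times = np.array([1.0, 2.0, 3.0, 4.0])
    events = np.array([1, 0, 1, 1])
    taus = composite_taus(3)
    # Placeholder predictions: the same three quantiles for every subject.
    q_pred = np.tile(np.array([1.5, 2.5, 3.5]), (4, 1))
    w = ipcw_weights(times, events)
    print("weights:", w)
    print("loss:", ipcw_composite_quantile_loss(times, q_pred, taus, w))
```

In a training loop, a loss of this shape would be minimized over the network parameters that produce `q_pred`; the weights depend only on the observed times and censoring indicators, so they can be computed once before training.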