Conditional Distribution Function Estimation Using Neural Networks for Censored and Uncensored Data

Authors

Hu Bingqing, Nan Bin

Affiliation

Department of Statistics, University of California, Irvine, Irvine, CA 92697, USA.

Publication information

J Mach Learn Res. 2023;24.

PMID:38249291

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10798802/

Abstract

Most work in neural networks focuses on estimating the conditional mean of a continuous response variable given a set of covariates. In this article, we consider estimating the conditional distribution function using neural networks for both censored and uncensored data. The algorithm is built upon the data structure particularly constructed for the Cox regression with time-dependent covariates. Without imposing any model assumptions, we consider a loss function that is based on the full likelihood where the conditional hazard function is the only unknown nonparametric parameter, for which unconstrained optimization methods can be applied. Through simulation studies, we show that the proposed method possesses desirable performance, whereas the partial likelihood method and the traditional neural networks with L2 loss yield biased estimates when model assumptions are violated. We further illustrate the proposed method with several real-world data sets. The implementation of the proposed methods is made available at https://github.com/bingqing0729/NNCDE.
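The two ingredients named in the abstract, a long-format data structure of the kind used for Cox regression with time-dependent covariates and a full-likelihood loss in the conditional hazard, can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes a piecewise-constant hazard on a fixed time grid, and every name in it (`expand_to_long_format`, `neg_log_likelihood`, `conditional_cdf`, the toy data) is hypothetical. A plain function returning the log-hazard stands in for a neural network output; because the hazard is parameterized as exp(g), g may take any real value, which is what permits unconstrained optimization.

```python
import math


def expand_to_long_format(times, events, grid):
    """One row per subject per grid interval during which the subject is at risk.

    grid holds increasing left endpoints starting at 0; the last interval is open-ended.
    Each row is (subject index, interval index, time at risk in interval, event indicator).
    """
    edges = list(grid) + [math.inf]
    rows = []
    for i, (y, d) in enumerate(zip(times, events)):
        for k in range(len(grid)):
            if y <= edges[k]:
                break  # subject left the risk set before this interval
            exposure = min(y, edges[k + 1]) - edges[k]
            event_here = int(d == 1 and y <= edges[k + 1])
            rows.append((i, k, exposure, event_here))
    return rows


def neg_log_likelihood(rows, log_hazard):
    """Negative log full likelihood under a piecewise-constant hazard (an assumption).

    log_hazard(i, k) returns g for subject i in interval k; any real value is allowed.
    """
    total = 0.0
    for i, k, exposure, event_here in rows:
        g = log_hazard(i, k)
        total += math.exp(g) * exposure - event_here * g
    return total


def conditional_cdf(log_hazards, widths):
    """F(t | x) = 1 - exp(-cumulative hazard), at the right end of each interval."""
    cdf, cum_hazard = [], 0.0
    for g, w in zip(log_hazards, widths):
        cum_hazard += math.exp(g) * w
        cdf.append(1.0 - math.exp(-cum_hazard))
    return cdf


# Toy data (hypothetical): event 1 = observed failure, 0 = right-censored.
times, events = [1.0, 2.5, 0.5], [1, 0, 1]
rows = expand_to_long_format(times, events, grid=[0.0, 1.0, 2.0])
nll = neg_log_likelihood(rows, lambda i, k: math.log(0.5))
```

In this toy setting a constant hazard of 0.5 (two events over four units of time at risk) minimizes the loss among constant hazards. In practice the stand-in `log_hazard` function would be replaced by a network taking time and covariates as inputs, and the loss would be minimized by gradient-based optimization.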
