Wu Yuan, Jiang Xiaoqian, Wang Shuang, Jiang Wenchao, Li Pinghao, Ohno-Machado Lucila
Department of Biostatistics & Bioinformatics, Duke University, Durham, NC, 27708, USA.
Division of Biomedical Informatics, Department of Medicine, University of California, San Diego, La Jolla, CA, 92093, USA.
BMC Med Inform Decis Mak. 2015 Feb 18;15:10. doi: 10.1186/s12911-015-0133-y.
Multi-category response models are important complements to binary logistic models in medical decision-making. When data cannot be moved outside institutions because of privacy or other concerns, model construction must be decomposed so that each site computes on its own data and only the resulting intermediate computations are aggregated. Such decomposition makes it possible to use grid computing while protecting the privacy of individual observations.
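The decomposition described above can be written out as a short derivation. This is a sketch of the general idea only, assuming observations are independent across sites; the symbols $S$ (number of sites), $\ell_k$ (log-likelihood of site $k$'s data) and $\beta^{(t)}$ (parameter estimate at iteration $t$) are notation introduced here for illustration and are not taken from the paper.

```latex
\begin{aligned}
\ell(\beta) &= \sum_{k=1}^{S} \ell_k(\beta), \\
\nabla \ell(\beta) &= \sum_{k=1}^{S} \nabla \ell_k(\beta), \qquad
\nabla^2 \ell(\beta) = \sum_{k=1}^{S} \nabla^2 \ell_k(\beta), \\
\beta^{(t+1)} &= \beta^{(t)} - \Bigl[\sum_{k=1}^{S} \nabla^2 \ell_k\bigl(\beta^{(t)}\bigr)\Bigr]^{-1}
\sum_{k=1}^{S} \nabla \ell_k\bigl(\beta^{(t)}\bigr).
\end{aligned}
```

Under this assumption, a Newton-Raphson update needs only the per-site gradients and Hessians, which are aggregate quantities rather than individual observations.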
This paper proposes two grid multi-category response models, one for ordinal and one for multinomial logistic regression. Grid computation to test model assumptions is also developed for both types of models. In addition, we present grid methods for goodness-of-fit assessment and for classification performance evaluation.
Simulation results show that the grid models produce the same results as those obtained from corresponding centralized models, demonstrating that it is possible to build models using multi-center data without losing accuracy or transmitting observation-level data. Two real data sets are used to evaluate the performance of our proposed grid models.
The grid fitting method offers a practical solution to the privacy and other issues that arise when all data are pooled at a central site. The proposed method is applicable to various likelihood estimation problems, including other generalized linear models.
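To make the grid fitting idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a multinomial logit model with category 0 as the reference, Newton-Raphson estimation, and a coordinating center that receives only each site's gradient and information matrix. The function names (`site_summaries`, `solve`, `fit`, `simulate`), the simulated data and the true coefficients are all hypothetical. The ordinal model, assumption tests and any secure transmission layer are not covered.

```python
import math
import random

def site_summaries(X, y, beta, K):
    """Run locally at one site: return the gradient and information matrix
    of the multinomial log-likelihood, never the individual rows."""
    p = len(X[0])
    d = (K - 1) * p
    grad = [0.0] * d
    info = [[0.0] * d for _ in range(d)]
    for x, yi in zip(X, y):
        eta = [sum(beta[j * p + a] * x[a] for a in range(p)) for j in range(K - 1)]
        denom = 1.0 + sum(math.exp(e) for e in eta)
        pr = [math.exp(e) / denom for e in eta]  # P(y = j + 1 | x)
        for j in range(K - 1):
            r = (1.0 if yi == j + 1 else 0.0) - pr[j]
            for a in range(p):
                grad[j * p + a] += r * x[a]
                for l in range(K - 1):
                    w = pr[j] * ((1.0 if j == l else 0.0) - pr[l])
                    for b in range(p):
                        info[j * p + a][l * p + b] += w * x[a] * x[b]
    return grad, info

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def fit(sites, K, iters=25):
    """Newton-Raphson at the coordinating center using only summed site summaries."""
    p = len(sites[0][0][0])
    d = (K - 1) * p
    beta = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        info = [[0.0] * d for _ in range(d)]
        for X, y in sites:
            g, h = site_summaries(X, y, beta, K)
            grad = [u + v for u, v in zip(grad, g)]
            info = [[u + v for u, v in zip(ru, rv)] for ru, rv in zip(info, h)]
        step = solve(info, grad)
        beta = [bv + s for bv, s in zip(beta, step)]
    return beta

def simulate(n, rng, K=3):
    """Hypothetical data: intercept plus one covariate, three outcome categories."""
    true = [[0.5, 1.0], [-0.5, -1.0]]  # illustrative coefficients only
    X, y = [], []
    for _ in range(n):
        x = [1.0, rng.gauss(0.0, 1.0)]
        w = [1.0] + [math.exp(sum(b * v for b, v in zip(bj, x))) for bj in true]
        y.append(rng.choices(range(K), weights=w)[0])
        X.append(x)
    return X, y

rng = random.Random(0)
sites = [simulate(n, rng) for n in (120, 80, 100)]  # three simulated sites
pooled = ([x for X, _ in sites for x in X], [v for _, y in sites for v in y])

beta_grid = fit(sites, K=3)        # center sees only per-site summaries
beta_central = fit([pooled], K=3)  # conventional fit on pooled data
max_diff = max(abs(a - b) for a, b in zip(beta_grid, beta_central))
print(max_diff)
```

Because the summed gradients and information matrices equal those of the pooled data, the two fits should agree up to floating-point rounding, which is the property the simulation results in the abstract describe.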