Boker Steven M, Brick Timothy R, Pritikin Joshua N, Wang Yang, von Oertzen Timo, Brown Donald, Lach John, Estabrook Ryne, Hunter Michael D, Maes Hermine H, Neale Michael C
a Department of Psychology, University of Virginia.
b Department of Health and Human Development, The Pennsylvania State University.
Multivariate Behav Res. 2015;50(6):706-20. doi: 10.1080/00273171.2015.1094387.
Maintained Individual Data Distributed Likelihood Estimation (MIDDLE) is a novel paradigm for research in the behavioral, social, and health sciences. The MIDDLE approach is based on the seemingly impossible idea that data can be privately maintained by participants and never revealed to researchers, while still enabling statistical models to be fit and scientific hypotheses tested. MIDDLE rests on the assumption that participant data should belong to, be controlled by, and remain in the possession of the participants themselves. Distributed likelihood estimation refers to fitting statistical models by sending an objective function and vector of parameters to each participant's personal device (e.g., smartphone, tablet, computer), where the likelihood of that individual's data is calculated locally. Only the likelihood value is returned to the central optimizer. The optimizer aggregates likelihood values from responding participants and chooses new vectors of parameters until the model converges. A MIDDLE study provides significantly greater privacy for participants, automatic management of opt-in and opt-out consent, lower cost for the researcher and funding institute, and faster determination of results. Furthermore, if a participant opts into several studies simultaneously and opts into data sharing, these studies automatically have access to individual-level longitudinal data linked across all studies.
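The estimation loop described above can be illustrated with a minimal single-process sketch. It is not the authors' implementation. The names `ParticipantDevice`, `normal_neg2ll`, `aggregate`, and `compass_search` are hypothetical, the normal model and the simulated data are assumptions chosen for illustration, and a simple derivative-free compass search stands in for whatever optimizer a real deployment would use. The sketch also assumes that each device returns minus twice its log-likelihood, so that the central optimizer can aggregate by summation. Networking, encryption, and participant nonresponse are left out.

```python
import math
import random


class ParticipantDevice:
    """Hypothetical stand-in for one participant's personal device."""

    def __init__(self, private_data):
        # In a real MIDDLE study these values would never leave the device.
        self._data = list(private_data)

    def evaluate(self, objective, params):
        # The objective runs locally; only a single scalar is returned.
        return objective(self._data, params)


def normal_neg2ll(data, params):
    """Minus twice the normal log-likelihood; params = (mean, log of SD)."""
    mu, log_sd = params
    var = math.exp(2.0 * log_sd)
    return sum(math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var for x in data)


def aggregate(devices, objective, params):
    """Central step: send objective and parameters out, sum the scalars returned."""
    return sum(device.evaluate(objective, params) for device in devices)


def compass_search(f, start, step=1.0, tol=1e-6, max_iter=10000):
    """Derivative-free minimizer: try +/- step on each coordinate, halve on failure."""
    x = list(start)
    fx = f(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                f_trial = f(trial)
                if f_trial < fx:
                    x, fx = trial, f_trial
                    improved = True
        if not improved:
            step /= 2.0
    return x, fx


# Simulated study (an assumption for illustration): 50 participants, 20 values each.
rng = random.Random(0)
devices = [
    ParticipantDevice([rng.gauss(5.0, 2.0) for _ in range(20)])
    for _ in range(50)
]

# The optimizer only ever sees the aggregated scalar, never the raw data.
estimate, value = compass_search(
    lambda params: aggregate(devices, normal_neg2ll, params),
    start=[0.0, 0.0],
)
mu_hat, sd_hat = estimate[0], math.exp(estimate[1])
print(f"mean estimate: {mu_hat:.3f}, SD estimate: {sd_hat:.3f}")
```

Because the participants' contributions are independent, the sum of per-device values equals the objective that would be computed on the pooled data, so in this sketch the distributed fit should agree with a conventional pooled maximum likelihood fit up to optimizer tolerance.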