Jiang Xiaoqian, Wu Yuan, Marsolo Keith, Ohno-Machado Lucila
University of California, San Diego.
Duke University.
EGEMS (Wash DC). 2014 Dec 26;2(1):1053. doi: 10.13063/2327-9214.1053. eCollection 2014.
We describe the functional specifications and practical aspects of the software development process for a web service that constructs a multivariate logistic regression model, Grid Logistic Regression (GLORE), by aggregating partial estimates from distributed sites without exchanging patient-level data.
We recently developed and published a web service for model construction and data analysis in a distributed environment. That paper gave an overview of the system aimed at users, but included few details relevant to biomedical informatics developers or network security personnel who may be interested in implementing this or similar systems. Here we focus on how the system was conceived and implemented.
We followed a two-stage development approach: we first implemented the backbone system, then incrementally improved the user experience through interactions with potential users during development. The system went through several stages, including proof of concept, algorithm validation, user interface development, and system testing. We used the Zoho Project management system to track tasks and milestones. We used Google Code and Apache Subversion to share code among team members, and developed an applet-servlet architecture to support cross-platform deployment.
During development, we encountered challenges such as gaps in information technology (IT) infrastructure and limited team experience in user-interface design. We identified solutions, as well as enabling factors, that supported the translation of an innovative privacy-preserving, distributed modeling technology into a working prototype.
Using GLORE (a distributed model that we developed earlier) as a pilot example, we demonstrated the feasibility of building distributed modeling technology and integrating it into a usable framework that supports privacy-preserving, distributed data analysis among researchers at geographically dispersed institutions.
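The idea of fitting a logistic regression by aggregating partial estimates from distributed sites can be illustrated with a short sketch. It assumes a Newton-Raphson scheme in which each site computes the gradient and Hessian of the log-likelihood on its own records and sends only those sums to a coordinator; this is one common way to realize such aggregation, not a description of the web service's actual code. All function names (`site_summaries`, `solve`, `fit_distributed`) and the toy data are hypothetical.

```python
import math

def site_summaries(X, y, beta):
    """Run locally at one site. Returns the gradient and Hessian (X^T W X)
    of the log-likelihood at beta; only these aggregates leave the site."""
    d = len(beta)
    grad = [0.0] * d
    hess = [[0.0] * d for _ in range(d)]
    for xi, yi in zip(X, y):
        z = sum(b * v for b, v in zip(beta, xi))
        p = 1.0 / (1.0 + math.exp(-z))      # predicted probability
        w = p * (1.0 - p)                   # per-record weight
        for j in range(d):
            grad[j] += (yi - p) * xi[j]
            for k in range(d):
                hess[j][k] += w * xi[j] * xi[k]
    return grad, hess

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit_distributed(sites, d, iters=25, tol=1e-10):
    """Coordinator loop: sum the per-site summaries, take a Newton step,
    and send the updated coefficients back for the next round."""
    beta = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        hess = [[0.0] * d for _ in range(d)]
        for X, y in sites:
            g, h = site_summaries(X, y, beta)
            grad = [a + b for a, b in zip(grad, g)]
            hess = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(hess, h)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Toy data (made up): each row is [intercept, feature]; labels are 0 or 1.
site_a = ([[1, 0.5], [1, 1.5], [1, 2.0], [1, 3.0]], [0, 0, 1, 1])
site_b = ([[1, 1.0], [1, 2.5], [1, 0.2], [1, 2.2]], [1, 0, 0, 1])

beta_distributed = fit_distributed([site_a, site_b], d=2)
beta_pooled = fit_distributed([(site_a[0] + site_b[0], site_a[1] + site_b[1])], d=2)
```

Because the gradient and Hessian are sums over records, adding the per-site summaries yields the same Newton step as pooling all records in one place, so `beta_distributed` and `beta_pooled` agree up to floating-point rounding while no row of `X` or `y` leaves its site.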