Dong Xiao, Randolph David A
Center for Clinical and Translational Science, University of Illinois at Chicago, Chicago, Illinois, USA.
AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:162-170. eCollection 2020.
Reliable cohort discovery is an essential early part of clinical study design. Indeed, it is the defining feature of many clinical research networks, including the recently launched Accrual to Clinical Trials (ACT) network. As currently deployed, however, the ACT network only allows cohort queries in isolated silos, which makes cohort discovery across sites unreliable. Here we demonstrate a novel protocol that gives network participants access to more accurate combined cohort estimates (union cardinality) across sites. A two-party ElGamal protocol is implemented to satisfy privacy and security imperatives, and a special attribute of Bloom filters is exploited for accurate and fast cardinality estimates. To emulate mandatory privacy-protecting obfuscation factors (like those applied to the counts reported for individual sites by ACT), we configure the Bloom filter based on the individual site cohort sizes, striking an appropriate balance between accuracy and privacy. Finally, we discuss the additional approval and data governance steps required to incorporate our protocol into the current ACT infrastructure.
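
The abstract does not say which Bloom filter attribute is exploited. The sketch below assumes it is the standard one: the bitwise OR of two filters built with the same size and hash functions equals the filter of the union of their sets, and the number of set bits X gives the cardinality estimate n ≈ -(m/k) ln(1 - X/m). All names here (`BloomFilter`, `m`, `k`, the `patient-…` identifiers) are hypothetical, and the ElGamal encryption layer and the cohort-size-based obfuscation described above are deliberately left out. It is an illustration of the estimator only, not the authors' implementation.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter with m bits and k hash functions (illustrative only)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = 0  # a Python int used as an m-bit array

    def _positions(self, item: str):
        # Derive k bit positions by salting SHA-256 with the hash index.
        # This hashing scheme is an assumption of this sketch.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def union(self, other: "BloomFilter") -> "BloomFilter":
        # OR-ing two compatible filters yields the filter of the set union.
        if (self.m, self.k) != (other.m, other.k):
            raise ValueError("filters must share m and k")
        merged = BloomFilter(self.m, self.k)
        merged.bits = self.bits | other.bits
        return merged

    def estimate_cardinality(self) -> float:
        # n is approximately -(m / k) * ln(1 - X / m), X = number of set bits.
        x = bin(self.bits).count("1")
        if x >= self.m:
            return float("inf")  # a saturated filter carries no size information
        return -(self.m / self.k) * math.log(1 - x / self.m)


# Two hypothetical sites whose cohorts overlap in 200 patients.
site_a = BloomFilter(m=16384, k=4)
site_b = BloomFilter(m=16384, k=4)
for i in range(0, 600):
    site_a.add(f"patient-{i}")
for i in range(400, 1000):
    site_b.add(f"patient-{i}")

combined = site_a.union(site_b)
print(round(combined.estimate_cardinality()))  # true union size is 1000
```

Because the estimate depends only on the merged bit array, neither site needs to see the other's identifiers. In this sketch the filter size `m` controls the trade-off the abstract alludes to: a smaller filter gives a coarser estimate, a larger one a more accurate estimate.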