Spicker Dylan, Moodie Erica E M, Shortreed Susan M
Department of Mathematics and Statistics, University of New Brunswick (Saint John), NB, Canada.
Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, QC, Canada.
Stat. 2024;13(1). doi: 10.1002/sta4.641. Epub 2024 Jan 17.
Precision medicine is a framework for developing evidence-based medical recommendations that seeks to determine the optimal sequence of treatments, tailored to all of the relevant observable patient-level characteristics. Because precision medicine relies on highly sensitive, patient-level data, ensuring the privacy of participants is of great importance. Dynamic treatment regimes (DTRs) provide one formalization of precision medicine in a longitudinal setting. Outcome-Weighted Learning (OWL) is a family of techniques for estimating optimal DTRs from observational data. OWL techniques leverage support vector machine (SVM) classifiers to perform estimation. SVMs perform classification based on a set of influential points in the data known as support vectors. The classification rule produced by an SVM often requires direct access to the support vectors. Thus, releasing a treatment policy estimated with OWL requires releasing patient data for a subset of patients in the sample. As a result, the classification rules from SVMs constitute a severe privacy violation for those individuals whose data comprise the support vectors. This privacy violation is a major concern, particularly in light of the potentially highly sensitive medical data used in DTR estimation. Differential privacy has emerged as a mathematical framework for ensuring the privacy of individual-level data, with provable guarantees on the likelihood that individual characteristics can be determined by an adversary. We provide the first investigation of differential privacy in the context of DTRs and propose a differentially private OWL estimator, with theoretical results that quantify the cost of privacy in terms of the accuracy of the private estimators.
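The differential-privacy guarantee the abstract refers to can be illustrated with a minimal sketch. This is not the paper's OWL estimator; it is the standard Laplace mechanism applied to a counting query over hypothetical patient records (the `patients` data and `dp_count` helper are illustrative assumptions). A count has sensitivity 1, so Laplace noise with scale 1/ε yields ε-differential privacy.

```python
import math
import random

def laplace_noise(scale, rng=random):
    """Draw one sample from a Laplace(0, scale) distribution via inverse CDF."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(records, predicate, epsilon):
    """Release a count under epsilon-differential privacy.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon suffices
    for the epsilon-DP guarantee.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Hypothetical patient records: (age, received_treatment)
patients = [(34, True), (57, False), (41, True), (68, True), (29, False)]

# Noisy number of treated patients; an adversary seeing only this
# release cannot confidently infer any single patient's treatment.
noisy = dp_count(patients, lambda p: p[1], epsilon=1.0)
```

Smaller ε gives stronger privacy but larger noise; this privacy-accuracy trade-off is exactly what the paper's theoretical results quantify for the private OWL estimator.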