Fecho Karamarie, Haaland Perry, Krishnamurthy Ashok, Lan Bo, Ramsey Stephen A, Schmitt Patrick L, Sharma Priya, Sinha Meghamala, Xu Hao
Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
Inform Med Unlocked. 2021;26. doi: 10.1016/j.imu.2021.100733. Epub 2021 Sep 20.
The Integrated Clinical and Environmental Exposures Service (ICEES) provides regulatory-compliant open access to sensitive patient data that have been integrated with public exposures data. ICEES was designed initially to support dynamic cohort creation and bivariate contingency tests. The objective of the present study was to develop an open approach to support multivariate analyses using existing ICEES functionalities while abiding by all regulatory constraints. We first developed an open approach for generating a multivariate table that maintains contingencies between clinical and environmental variables using programmatic calls to the open ICEES application programming interface. We then applied the approach to data on a large cohort (N = 22,365) of patients with asthma or related conditions and generated an eight-feature table. Due to regulatory constraints, data loss was incurred with the incorporation of each successive feature variable, from a starting sample size of N = 22,365 to a final sample size of N = 4,556 (20.4% of the starting sample), but data loss was < 10% until the addition of the final two feature variables. We then applied a generalized linear model to the resulting dataset and focused on the impact of seven select feature variables on asthma exacerbations, defined as annual emergency department or inpatient visits for respiratory issues. We identified five feature variables (sex, race, obesity, prednisone, and airborne particulate exposure) as significant predictors of asthma exacerbations. We discuss the advantages and disadvantages of ICEES open multivariate analysis and conclude that, despite limitations, ICEES can provide a valuable resource for open multivariate analysis and can serve as an exemplar for regulatory-compliant informatic solutions to open patient data, with capabilities to explore the impact of environmental exposures on health outcomes.
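The sample-size figures reported in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the lower bound on the sample size before the final two feature variables is inferred from the stated "< 10%" data loss, since the abstract does not report intermediate sample sizes.

```python
# Sample sizes as stated in the abstract.
start_n = 22_365   # starting cohort
final_n = 4_556    # after all eight feature variables were incorporated

# Fraction of the starting cohort retained in the final table, and the loss.
retained = final_n / start_n
lost = 1 - retained

# Assumption: "data loss was < 10%" means strictly more than 90% of the
# starting cohort was still present before the final two feature variables
# were added. Integer arithmetic avoids floating-point rounding.
min_n_before_last_two = (start_n * 9) // 10 + 1

print(f"retained: {retained:.1%}")
print(f"lost: {lost:.1%}")
print(f"implied minimum N before the final two features: {min_n_before_last_two}")
```

Under these assumptions, the 20.4% figure is the share of the starting cohort that remains in the final table, so most of the data loss occurs with the last two feature variables.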