Weber Susan C, Seto Tina, Olson Cliff, Kenkare Pragati, Kurian Allison W, Das Amar K
Center for Clinical Informatics, Stanford University, USA.
AMIA Annu Symp Proc. 2012;2012:970-8. Epub 2012 Nov 3.
Comparative effectiveness research (CER) using observational data requires informatics methods for the extraction, standardization, sharing, and integration of data derived from a variety of electronic sources. In the Oncoshare project, we have developed such methods as part of a collaborative multi-institutional CER study of patterns, predictors, and outcomes of breast cancer care. In this paper, we present an evaluation of the approaches we undertook and the lessons we learned in building and validating the Oncoshare data resource. Specifically, we determined that 1) the state or regional cancer registry makes the most efficient starting point for determining inclusion of subjects; 2) the data dictionary should be based on existing registry standards, such as Surveillance, Epidemiology and End Results (SEER), when applicable; 3) the Social Security Administration Death Master File (SSA DMF), rather than clinical resources, provides standardized ascertainment of mortality outcomes; and 4) CER database development efforts, despite the immediate availability of electronic data, may take as long as two years to produce validated, reliable data for research. Using these methods, Oncoshare integrates complex, longitudinal data from multiple electronic medical records and registries and provides a rich, validated resource for research on oncology care.