Assessing the Contribution of Scanned Outside Documents to the Completeness of Real-World Data Abstraction.

Author information

Department of Cancer Epidemiology, Moffitt Cancer Center, Tampa, FL.

Department of Health Informatics, Moffitt Cancer Center, Tampa, FL.

Publication information

JCO Clin Cancer Inform. 2023 Feb;7:e2200118. doi: 10.1200/CCI.22.00118.

Abstract

PURPOSE

Electronic health record (EHR) data are widely used in precision medicine, quality improvement, disease surveillance, and population health management. However, a significant amount of EHR data is stored in unstructured formats, including scanned documents that originate outside the treatment facility, which presents an informatics challenge for secondary use. Studies are needed to characterize the clinical information uniquely available in scanned outside documents (SODs) and to understand the extent to which the availability of such information affects the use of these real-world data for cancer research.

MATERIALS AND METHODS

Two independent EHR data abstractions capturing 30 variables commonly used in oncology research were conducted for 125 patients treated for advanced non-small-cell lung cancer at a comprehensive cancer center, with and without consideration of SODs. Completeness and concordance were compared between the two abstractions, overall, and by patient groups and variable types.
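
As a rough illustration of the comparison described above, the sketch below computes completeness and concordance for two toy abstractions. The abstract does not give the study's exact definitions, so both are assumptions: completeness is taken as the share of patient-by-variable cells with a non-missing value, and concordance as the share of cells populated in both abstractions whose values agree. The function names, variable names, and sample records are all hypothetical.

```python
from typing import Optional

# Hypothetical layout: patient ID -> {variable name: value, or None when missing}.
Abstraction = dict[str, dict[str, Optional[str]]]


def completeness(abstraction: Abstraction, variables: list[str]) -> float:
    """Assumed definition: share of patient-by-variable cells that are non-missing."""
    total = len(abstraction) * len(variables)
    filled = sum(
        1
        for record in abstraction.values()
        for var in variables
        if record.get(var) is not None
    )
    return filled / total if total else 0.0


def concordance(a: Abstraction, b: Abstraction, variables: list[str]) -> float:
    """Assumed definition: among cells populated in both abstractions, share that agree."""
    both = agree = 0
    for patient in a.keys() & b.keys():
        for var in variables:
            va, vb = a[patient].get(var), b[patient].get(var)
            if va is not None and vb is not None:
                both += 1
                agree += va == vb
    return agree / both if both else 0.0


# Toy data (not from the study): two patients, two variables.
variables = ["stage", "egfr_status"]
without_sods: Abstraction = {
    "p1": {"stage": "IV", "egfr_status": None},
    "p2": {"stage": "IV", "egfr_status": None},
}
with_sods: Abstraction = {
    "p1": {"stage": "IV", "egfr_status": "positive"},
    "p2": {"stage": "IV", "egfr_status": None},
}

print(completeness(without_sods, variables))  # 0.5
print(completeness(with_sods, variables))     # 0.75
print(concordance(without_sods, with_sods, variables))  # 1.0
```

Restricting `variables` to one variable type, or filtering the records to one patient group, would give the stratified comparisons the methods mention.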

RESULTS

The overall completeness of the data was 77.6% for the abstraction with SODs, compared with 54.3% for the abstraction without SODs. The differences in completeness were driven by data related to biomarker tests, which were more likely to be uniquely available in SODs. Such data were prone to missingness among patients who were diagnosed externally.

CONCLUSION

There were no major differences in completeness between the two abstractions by demographics, diagnosis, disease progression, performance status, or oral therapy use. However, biomarker data were more likely to be uniquely contained in the SODs. Our findings may help cancer centers prioritize the types of SOD data being abstracted for research or other secondary purposes.
