Knowledge Graph-Enabled Cancer Data Analytics.

Publication information

IEEE J Biomed Health Inform. 2020 Jul;24(7):1952-1967. doi: 10.1109/JBHI.2020.2990797. Epub 2020 May 4.

Abstract

Cancer registries collect unstructured and structured cancer data for surveillance purposes, providing important insights into cancer characteristics, treatments, and outcomes. Cancer registry data typically (1) categorize each reportable cancer case or tumor at the time of diagnosis, (2) contain demographic information about the patient, such as age, gender, and location at the time of diagnosis, (3) include planned and completed primary treatment information, and (4) may contain survival outcomes. As structured data are extracted from various unstructured sources, such as pathology reports, radiology reports, and medical records, and stored for reporting and other needs, the information associated with a reportable cancer is constantly expanding and evolving. While some popular analytic approaches, including SEER*Stat and SAS, exist, we provide a knowledge graph approach to organizing cancer registry data. Our approach offers unique advantages for timely data analysis and for the presentation and visualization of valuable information. This knowledge graph approach semantically enriches the data and makes it easy to link with third-party data, which can help explain variation in cancer incidence patterns, disparities, and outcomes. We developed a prototype knowledge graph based on the Louisiana Tumor Registry dataset. We present the advantages of the knowledge graph approach by examining: i) scenario-specific queries, ii) links with openly available external datasets, iii) schema evolution for iterative analysis, and iv) data visualization. Our results demonstrate that this graph-based solution can perform complex queries, improve query run-time performance by up to 76%, and make iterative analyses easier to conduct, enhancing researchers' understanding of cancer registry data.
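The abstract does not describe the authors' implementation, so the following is only a minimal sketch of the general idea, not their method. It assumes registry cases are stored as subject-predicate-object triples and that an external dataset is linked through a shared county node. All identifiers (`case:1`, `hasSite`, `povertyLevel`, and so on) and all values are hypothetical toy data, not Louisiana Tumor Registry records.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not real registry data.
TRIPLES = [
    ("case:1", "hasSite", "breast"),
    ("case:1", "diagnosedIn", "county:A"),
    ("case:2", "hasSite", "lung"),
    ("case:2", "diagnosedIn", "county:B"),
    ("case:3", "hasSite", "breast"),
    ("case:3", "diagnosedIn", "county:B"),
    # Triples from an assumed external dataset, linked via the shared county node.
    ("county:A", "povertyLevel", "high"),
    ("county:B", "povertyLevel", "low"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def cases_by_poverty(triples, site):
    """Count cases of one cancer site per poverty level of the county of diagnosis."""
    counts = defaultdict(int)
    for case, _, _ in match(triples, p="hasSite", o=site):
        for _, _, county in match(triples, s=case, p="diagnosedIn"):
            for _, _, level in match(triples, s=county, p="povertyLevel"):
                counts[level] += 1
    return dict(counts)


def add_attribute(triples, subject, predicate, value):
    """Schema evolution in this sketch: a new attribute is just one more triple."""
    return triples + [(subject, predicate, value)]


if __name__ == "__main__":
    # Scenario-specific query that traverses registry and external triples together.
    print(cases_by_poverty(TRIPLES, "breast"))
    # Adding a new kind of attribute needs no change to existing triples.
    extended = add_attribute(TRIPLES, "case:1", "hasStage", "II")
    print(match(extended, s="case:1", p="hasStage"))
```

In this toy form, linking an external dataset and adding a new attribute both amount to appending triples, which loosely illustrates why a graph representation can suit iterative analysis. A production system would more likely use a graph database or triple store with a dedicated query language.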
