Knowledge Graph-Enabled Cancer Data Analytics.

Publication information

IEEE J Biomed Health Inform. 2020 Jul;24(7):1952-1967. doi: 10.1109/JBHI.2020.2990797. Epub 2020 May 4.

DOI:10.1109/JBHI.2020.2990797
PMID:32386166
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8324069/
Abstract

Cancer registries collect unstructured and structured cancer data for surveillance purposes, which provide important insights regarding cancer characteristics, treatments, and outcomes. Cancer registry data typically (1) categorize each reportable cancer case or tumor at the time of diagnosis, (2) contain demographic information about the patient such as age, gender, and location at time of diagnosis, (3) include planned and completed primary treatment information, and (4) may contain survival outcomes. As structured data is extracted from various unstructured sources, such as pathology reports, radiology reports, and medical records, and stored for reporting and other needs, the associated information representing a reportable cancer is constantly expanding and evolving. While some popular analytic approaches including SEER*Stat and SAS exist, we provide a knowledge graph approach to organizing cancer registry data. Our approach offers unique advantages for timely data analysis and for the presentation and visualization of valuable information. This knowledge graph approach semantically enriches the data and easily enables linking with third-party data, which can help explain variation in cancer incidence patterns, disparities, and outcomes. We developed a prototype knowledge graph based on the Louisiana Tumor Registry dataset. We present the advantages of the knowledge graph approach by examining: i) scenario-specific queries, ii) links with openly available external datasets, iii) schema evolution for iterative analysis, and iv) data visualization. Our results demonstrate that this graph-based solution can perform complex queries, improve query run-time performance by up to 76%, and more easily conduct iterative analyses to enhance researchers' understanding of cancer registry data.
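
To make the idea of a registry knowledge graph concrete, the following is a minimal sketch in plain Python, not the authors' implementation (the abstract does not describe their tooling). It stores facts as subject–predicate–object triples, links cases to a made-up external county dataset through a shared node, adds a new predicate without any schema migration, and runs one scenario-specific query. All identifiers, predicate names, and values (`case:1`, `hasSite`, `povertyRate`, and so on) are hypothetical.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); no value here comes from the paper.
TRIPLES = [
    ("case:1", "hasSite", "Breast"),
    ("case:1", "diagnosedInCounty", "county:A"),
    ("case:1", "ageAtDiagnosis", 54),
    ("case:2", "hasSite", "Lung"),
    ("case:2", "diagnosedInCounty", "county:B"),
    ("case:2", "ageAtDiagnosis", 67),
    ("case:3", "hasSite", "Breast"),
    ("case:3", "diagnosedInCounty", "county:B"),
    ("case:3", "ageAtDiagnosis", 61),
    # Facts from an invented external dataset, linked via the shared county node.
    ("county:A", "povertyRate", 0.12),
    ("county:B", "povertyRate", 0.27),
]


def build_index(triples):
    """Index triples as subject -> {predicate: object} for simple lookups."""
    index = defaultdict(dict)
    for subject, predicate, obj in triples:
        index[subject][predicate] = obj
    return index


def cases_by_site_and_poverty(index, site, min_poverty):
    """Return sorted case IDs for `site` in counties with povertyRate >= min_poverty."""
    matches = []
    for subject, props in index.items():
        if props.get("hasSite") != site:
            continue
        county = props.get("diagnosedInCounty")
        # Follow the link from the case to the external county facts.
        rate = index.get(county, {}).get("povertyRate")
        if rate is not None and rate >= min_poverty:
            matches.append(subject)
    return sorted(matches)


index = build_index(TRIPLES)

# Schema evolution: a new predicate is just another fact on an existing node.
index["case:3"]["hasSubtype"] = "TripleNegative"

result = cases_by_site_and_poverty(index, "Breast", 0.2)
print(result)
```

A production system would more likely use a dedicated graph store and query language; the dictionary index here only illustrates how linking and iterative schema changes stay cheap when data is modeled as a graph.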


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ebcc/8324069/1b2534c6993f/nihms-1608706-f0009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ebcc/8324069/c2b5ffd4244a/nihms-1608706-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ebcc/8324069/ee22da1d9ff4/nihms-1608706-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ebcc/8324069/d3b7af747902/nihms-1608706-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ebcc/8324069/3fb8a7244b0d/nihms-1608706-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ebcc/8324069/18ec1ebd74b5/nihms-1608706-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ebcc/8324069/a0a83e2ec988/nihms-1608706-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ebcc/8324069/8e150822f1ae/nihms-1608706-f0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ebcc/8324069/34370b8631a7/nihms-1608706-f0008.jpg