Development and Portability of a Text Mining Algorithm for Capturing Disease Progression in Electronic Health Records of Patients With Stage IV Non-Small Cell Lung Cancer.

Affiliations

Department of Clinical Pharmacy, St Antonius Hospital, Utrecht, the Netherlands.

Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Utrecht, the Netherlands.

Publication information

JCO Clin Cancer Inform. 2024 Oct;8:e2400053. doi: 10.1200/CCI.24.00053. Epub 2024 Oct 4.

DOI:10.1200/CCI.24.00053
PMID:39365963
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11469628/
Abstract

PURPOSE

The objective was to develop and evaluate the portability of a text mining algorithm for prospectively capturing disease progression in electronic health record (EHR) data of patients with metastatic non-small cell lung cancer (mNSCLC) treated with immunochemotherapy.

METHODS

This study used EHR data from patients with mNSCLC who received immunochemotherapy between October 1, 2018, and December 31, 2022, in four Dutch hospitals. A text mining algorithm for capturing disease progression was developed in hospitals 1 and 2 and then transferred to hospitals 3 and 4 to evaluate portability. Performance metrics were calculated by comparing the algorithm's outcomes with manual chart review. In addition, data were simulated to become available over time to assess performance in real-time applications. Median progression-free survival (PFS) was calculated using the Kaplan-Meier method to compare text mining with manual chart review.
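The comparison with manual chart review described above can be illustrated with a minimal Python sketch. The abstract does not state which performance metrics were used, so the metric set here (sensitivity, specificity, PPV, NPV, accuracy) is an assumption, as are the function name `progression_metrics` and the toy labels, which are invented and are not study data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Metrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    accuracy: float


def progression_metrics(text_mining: Sequence[bool],
                        chart_review: Sequence[bool]) -> Metrics:
    """Score per-patient progression labels against the manual reference.

    Hypothetical helper: True means "progression captured" for that patient.
    """
    if len(text_mining) != len(chart_review):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(text_mining, chart_review))
    tp = sum(1 for tm, ref in pairs if tm and ref)          # both say progression
    tn = sum(1 for tm, ref in pairs if not tm and not ref)  # both say no progression
    fp = sum(1 for tm, ref in pairs if tm and not ref)      # algorithm only
    fn = sum(1 for tm, ref in pairs if not tm and ref)      # chart review only

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return Metrics(
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        ppv=ratio(tp, tp + fp),
        npv=ratio(tn, tn + fn),
        accuracy=ratio(tp + tn, len(pairs)),
    )


# Toy labels, invented for illustration only.
tm_labels = [True, True, False, True, False, False, True, False, True, False]
ref_labels = [True, True, False, True, False, True, True, False, False, False]
m = progression_metrics(tm_labels, ref_labels)
print(m)
```

To mimic the simulated real-time setting, the same function could be applied repeatedly to labels derived only from records dated before a moving cutoff; how the study implemented that simulation is not described in the abstract.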

RESULTS

During development and portability testing, the text mining algorithm performed well in capturing disease progression, with all performance scores >90%. When real-time performance was simulated, the performance scores in all four hospitals exceeded 90% from week 15 after the start of follow-up. Although the exact progression dates differed in 46 of the 157 patients with progressive disease, the number of patients labeled with progression too early (n = 24) and too late (n = 22) was well balanced, with discrepancies ranging from -116 to 384 days. Despite these date discrepancies, the PFS curves constructed with text mining and manual chart review were highly similar for each hospital.
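To make the PFS comparison concrete, here is a minimal pure-Python sketch of a Kaplan-Meier median estimate applied to two sets of progression times. It is not the authors' implementation: the function name `km_median`, the convention of taking the first time at which the survival estimate drops to 0.5 or below, and all durations are assumptions chosen for illustration.

```python
from typing import Optional, Sequence


def km_median(durations: Sequence[float],
              events: Sequence[bool]) -> Optional[float]:
    """Kaplan-Meier median: first event time where S(t) <= 0.5, else None.

    durations: follow-up time per patient (e.g. days since treatment start)
    events:    True if progression was observed, False if censored
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            # Product-limit step: S(t) = S(t-) * (1 - d / n)
            survival *= 1 - n_events / at_risk
            if survival <= 0.5:
                return t
        at_risk -= n_leaving
    return None  # median not reached


# Invented toy data: same patients, progression dates shifted for two of them.
observed = [True, True, False, True, True, False, True, False]
manual_days = [90, 120, 150, 200, 240, 300, 365, 400]
text_mining_days = [90, 105, 150, 200, 260, 300, 365, 400]

median_manual = km_median(manual_days, observed)
median_text_mining = km_median(text_mining_days, observed)
print(median_manual, median_text_mining)
```

In this toy example the two medians differ by 20 days even though only two patients have shifted dates, which illustrates how balanced early and late discrepancies can leave the overall curves close to each other.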

CONCLUSION

In this study, an accurate text mining algorithm for capturing disease progression in the EHR data of patients with mNSCLC was developed. The algorithm was portable across different hospitals, and the performance over time was good, making this an interesting approach for prospective follow-up of multicenter cohorts.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9881/11469628/bd218b7d451a/cci-8-e2400053-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9881/11469628/aaa5f735848d/cci-8-e2400053-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9881/11469628/016f345582aa/cci-8-e2400053-g002.jpg

