Advanced Sampling Technique in Radiology Free-Text Data for Efficiently Building Text Mining Models by Deep Learning in Vertebral Fracture.

Author information

Hung Wei-Chieh, Lin Yih-Lon, Lin Chi-Wei, Chin Wei-Leng, Wu Chih-Hsing

Affiliations

Department of Family and Community Medicine, E-Da Hospital, I-Shou University, Kaohsiung 82445, Taiwan.

School of Medicine, I-Shou University, Kaohsiung 84001, Taiwan.

Publication information

Diagnostics (Basel). 2024 Jan 8;14(2):137. doi: 10.3390/diagnostics14020137.

DOI:10.3390/diagnostics14020137

PMID:38248014

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10814913/

Abstract

This study aims to establish advanced sampling methods for free-text data so that semantic text mining models can be built efficiently with deep learning, for example to identify vertebral compression fracture (VCF) in radiology reports. We enrolled a total of 27,401 free-text radiology reports of spinal X-ray examinations. Text mining models were built with supervised long short-term memory networks, and their predictive performance was compared across four sampling methods (vector sum minimization, vector sum maximization, stratified sampling, and simple random sampling), each applied at four fixed sampling ratios. For each combination of sampling method and ratio, the drawn samples formed the training set and the remaining reports were used for validation. Predictive accuracy for identifying VCF was measured using the area under the receiver operating characteristic curve (AUROC). At sampling ratios of 1/10, 1/20, 1/30, and 1/40, vector sum minimization yielded the highest AUROC: 0.981 (95% CI: 0.980-0.983), 0.963 (95% CI: 0.961-0.965), 0.907 (95% CI: 0.904-0.911), and 0.895 (95% CI: 0.891-0.899), respectively. Vector sum maximization yielded the lowest AUROC. This study proposes an advanced sampling method for free-text data, vector sum minimization, that can be applied to build text mining models efficiently by drawing a small number of critical, representative samples.
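The evaluation protocol in the abstract (draw a fixed fraction for training, validate on the remainder, score with AUROC) can be illustrated with a minimal sketch. It covers only the two baseline methods, simple random and stratified sampling; the vector-sum methods are not defined in the abstract and are not attempted here. The function names, the toy labels, and the 10% prevalence are assumptions made for illustration, not the authors' code or data.

```python
import random
from collections import defaultdict

def simple_random_split(labels, ratio, seed=0):
    """Draw round(n * ratio) indices uniformly at random; the rest validate."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    k = round(len(labels) * ratio)
    return sorted(indices[:k]), sorted(indices[k:])

def stratified_split(labels, ratio, seed=0):
    """Draw the same fraction from each class so class proportions are kept."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train = []
    for members in by_class.values():
        rng.shuffle(members)
        train.extend(members[:round(len(members) * ratio)])
    train_set = set(train)
    valid = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train), valid

def auroc(labels, scores):
    """AUROC as the chance a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: 1,000 hypothetical reports, 10% labelled VCF (made-up prevalence).
labels = [1] * 100 + [0] * 900
for ratio in (1 / 10, 1 / 20, 1 / 30, 1 / 40):
    train, valid = stratified_split(labels, ratio)
    print(f"ratio {ratio:.3f}: {len(train)} training, {len(valid)} validation")
```

In a full pipeline, a classifier trained on the `train` indices would produce scores for the `valid` indices, and `auroc` would be applied to those scores. The pairwise AUROC shown here is quadratic in the number of reports, so a rank-based implementation would be preferable at the scale of the study.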

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2123/10814913/0552a9753070/diagnostics-14-00137-g001.jpg
