

Multistep Automated Data Labelling Procedure (MADLaP) for thyroid nodules on ultrasound: An artificial intelligence approach for automating image annotation.

Affiliations

Department of Electrical and Computer Engineering, Duke University, Room 10070, 2424 Erwin Rd, Durham, NC 27705, United States.

Department of Radiology, Duke University Medical Center, Durham, NC, United States; Department of Electrical and Computer Engineering, Department of Biostatistics and Bioinformatics, Department of Computer Science, Duke University, Room 9044, 2424 Erwin Rd, Durham, NC 27705, United States.

Publication information

Artif Intell Med. 2023 Jul;141:102553. doi: 10.1016/j.artmed.2023.102553. Epub 2023 Apr 22.

Abstract

Machine learning (ML) for diagnosis of thyroid nodules on ultrasound is an active area of research. However, ML tools require large, well-labeled datasets, the curation of which is time-consuming and labor-intensive. The purpose of our study was to develop and test a deep-learning-based tool to facilitate and automate the data annotation process for thyroid nodules; we named our tool Multistep Automated Data Labelling Procedure (MADLaP). MADLaP was designed to take multiple inputs including pathology reports, ultrasound images, and radiology reports. Using multiple step-wise 'modules' including rule-based natural language processing, deep-learning-based imaging segmentation, and optical character recognition, MADLaP automatically identified images of a specific thyroid nodule and correctly assigned a pathology label. The model was developed using a training set of 378 patients across our health system and tested on a separate set of 93 patients. Ground truths for both sets were selected by an experienced radiologist. Performance metrics including yield (how many labeled images the model produced) and accuracy (percentage correct) were measured using the test set. MADLaP achieved a yield of 63 % and an accuracy of 83 %. The yield progressively increased as the input data moved through each module, while accuracy peaked part way through. Error analysis showed that inputs from certain examination sites had lower accuracy (40 %) than the other sites (90 %, 100 %). MADLaP successfully created curated datasets of labeled ultrasound images of thyroid nodules. While accurate, the relatively suboptimal yield of MADLaP exposed some challenges when trying to automatically label radiology images from heterogeneous sources. The complex task of image curation and annotation could be automated, allowing for enrichment of larger datasets for use in machine learning development.
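The two reported metrics can be illustrated with a minimal sketch. It assumes that yield is the fraction of test cases for which the pipeline produced a labeled image, and that accuracy is the fraction of those produced labels that match the radiologist-selected ground truth. The abstract does not give exact formulas, so these definitions, the `CaseResult` structure, and the function name are hypothetical and are not taken from the MADLaP implementation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class CaseResult:
    """Outcome for one test case (hypothetical structure, not from the paper)."""
    case_id: str
    predicted_label: Optional[str]  # None when the pipeline produced no labeled image
    true_label: str                 # ground truth chosen by the radiologist


def yield_and_accuracy(results: Sequence[CaseResult]) -> Tuple[float, float]:
    """Return (yield, accuracy) under the assumed definitions.

    yield    = cases with a produced label / all cases
    accuracy = correctly labeled cases / cases with a produced label
    """
    if not results:
        raise ValueError("results must not be empty")

    labeled = [r for r in results if r.predicted_label is not None]
    yield_rate = len(labeled) / len(results)

    if not labeled:
        # Accuracy is undefined when nothing was labeled.
        return yield_rate, float("nan")

    correct = sum(r.predicted_label == r.true_label for r in labeled)
    return yield_rate, correct / len(labeled)


# Toy data for illustration only; these are not results from the study.
example = [
    CaseResult("a", "benign", "benign"),
    CaseResult("b", "malignant", "benign"),
    CaseResult("c", None, "benign"),
    CaseResult("d", "malignant", "malignant"),
]
example_yield, example_accuracy = yield_and_accuracy(example)
```

Under these assumed definitions, accuracy is computed only over cases that received a label, so a pipeline can be accurate while still leaving many cases unlabeled. That is the trade-off the abstract describes.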

