Wang Xiaoyong, Teng Pangyu, Ontiveros Ashley, Goldin Jonathan G, Brown Matthew S
University of California, Los Angeles, Center for Computer Vision and Imaging Biomarkers, Los Angeles, California, United States.
University of California, Los Angeles, Department of Radiological Sciences, Los Angeles, California, United States.
J Med Imaging (Bellingham). 2020 Mar;7(2):024501. doi: 10.1117/1.JMI.7.2.024501. Epub 2020 Mar 20.
When mining image data from PACS or clinical trials, or when processing large volumes of uncurated data, the relevant scans must be identified among irrelevant or redundant data. Only images acquired with appropriate technical factors, patient positioning, and physiological conditions may be applicable to a particular image processing or machine learning task. Automatic labeling is important for making big data mining practical, because it replaces conventional manual review of every image series. Digital Imaging and Communications in Medicine (DICOM) headers usually do not provide all the necessary labels and are sometimes incorrect. We propose an image-based, high-throughput labeling pipeline using deep learning, aimed at identifying scan direction, scan posture, lung coverage, contrast usage, and breath-hold type. These tasks were posed as separate classification problems, some of which involved further segmentation and identification of anatomic landmarks. Images of different view planes were used depending on the specific classification problem. All of our models achieved accuracy on the test set across the different tasks, using a research database from multicenter clinical trials.