Hasei Joe, Nakahara Ryuichi, Otsuka Yujiro, Nakamura Yusuke, Ikuta Kunihiro, Osaki Shuhei, Tamiya Hironari, Miwa Shinji, Ohshika Shusa, Nishimura Shunji, Kahara Naoaki, Yoshida Aki, Fujiwara Tomohiro, Nakata Eiji, Kunisada Toshiyuki, Ozaki Toshifumi
Department of Medical Information and Assistive Technology Development, Graduate School of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, Okayama 700-8558, Japan.
Science of Functional Recovery and Reconstruction, Graduate School of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, Okayama 700-8558, Japan.
Cancers (Basel). 2024 Dec 25;17(1):29. doi: 10.3390/cancers17010029.
Background/Objectives: Developing high-performance artificial intelligence (AI) models for rare diseases is challenging owing to limited data availability. This study aimed to evaluate whether a novel three-class annotation method for preparing training data could improve AI model performance in detecting osteosarcoma on plain radiographs compared with conventional single-class annotation.
Methods: We applied two annotation methods to the same dataset of 468 osteosarcoma radiographs and 378 normal radiographs: a conventional single-class annotation (1C model) and a novel three-class annotation (3C model) that separately labeled the intramedullary, cortical, and extramedullary tumor components. Both models used identical U-Net-based architectures and differed only in their annotation approach. Performance was evaluated on an independent validation dataset.
Results: Although both models achieved high diagnostic accuracy (AUC: 0.99 vs. 0.98), the 3C model demonstrated superior operational characteristics. At a standardized cutoff value of 0.2, the 3C model maintained balanced performance (sensitivity: 93.28%, specificity: 92.21%), whereas the 1C model showed compromised specificity (83.58%) despite high sensitivity (98.88%). Notably, at the 25th percentile threshold, both models showed identical false-negative rates despite significantly different cutoff values (3C: 0.661 vs. 1C: 0.985), indicating that the 3C model can maintain diagnostic accuracy at substantially lower thresholds.
Conclusions: This study demonstrated that anatomically informed three-class annotation can enhance AI model performance for rare disease detection without requiring additional training data. The improved stability at lower thresholds suggests that thoughtful annotation strategies can optimize AI model training, particularly where training data are limited.
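The sensitivity and specificity figures reported at a fixed cutoff can be illustrated with a minimal sketch. It assumes that each radiograph receives a single score between 0 and 1 and is called positive when the score is at or above the cutoff; the abstract does not state how the image-level score is derived from the U-Net output or whether the comparison is inclusive, so both are assumptions. The function name `sensitivity_specificity` and the toy scores are hypothetical and are not the study's data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], cutoff: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for image-level scores at a cutoff.

    labels: 1 = osteosarcoma, 0 = normal.
    An image is called positive when score >= cutoff (assumed convention).
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= cutoff
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative image")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up scores for illustration only (not taken from the study).
toy_scores = [0.95, 0.70, 0.30, 0.10, 0.25, 0.15, 0.05, 0.02]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

for cutoff in (0.2, 0.5):
    sens, spec = sensitivity_specificity(toy_scores, toy_labels, cutoff)
    print(f"cutoff={cutoff}: sensitivity={sens:.2%}, specificity={spec:.2%}")
```

Raising the cutoff in this toy example trades sensitivity for specificity, which is the trade-off the abstract describes when it compares the two models at a common cutoff of 0.2.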