Zhang Yaping, Liu Mingqian, Hu Shundong, Shen Yao, Lan Jun, Jiang Beibei, de Bock Geertruida H, Vliegenthart Rozemarijn, Chen Xu, Xie Xueqian
Radiology Department, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Haining Rd. 100, Shanghai, 200080 China.
Radiology Department, Shanghai General Hospital of Nanjing Medical University, Haining Rd. 100, Shanghai, 200080 China.
Commun Med (Lond). 2021 Oct 28;1:43. doi: 10.1038/s43856-021-00043-x. eCollection 2021.
Artificial intelligence can assist in interpreting chest X-ray radiography (CXR) data, but large datasets require efficient image annotation. The purpose of this study is to extract CXR labels from diagnostic reports based on natural language processing, train convolutional neural networks (CNNs), and evaluate the classification performance of CNN using CXR data from multiple centers.
We collected the CXR images and corresponding radiology reports of 74,082 subjects as the training dataset. Linguistic entities and relationships were extracted from the unstructured radiology reports by the Bidirectional Encoder Representations from Transformers (BERT) model, and a knowledge graph was constructed to represent the association between image labels of abnormal signs and the report text of CXR. Then, a 25-label classification system was built to train and test the CNN models with weakly supervised labeling.
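The abstract does not include implementation details, but the two-stage pipeline it describes (BERT-based label extraction from report text, then weakly supervised multi-label CNN training) can be sketched as follows. This is a minimal illustration assuming a PyTorch/Hugging Face stack; the model choices (bert-base-chinese, DenseNet-121), the threshold, and all hyperparameters are assumptions for illustration, not the authors' configuration.

```python
# Minimal sketch of the pipeline described above (assumed stack:
# PyTorch + Hugging Face transformers + torchvision; not the authors' code).
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from torchvision import models

NUM_LABELS = 25  # the 25 abnormal-sign labels

# Stage 1: a BERT text classifier that turns free-text reports into weak
# multi-hot labels ("bert-base-chinese" is an assumed checkpoint).
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
labeler = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-chinese",
    num_labels=NUM_LABELS,
    problem_type="multi_label_classification",
)

def weak_labels(report_text: str, threshold: float = 0.5) -> torch.Tensor:
    """Predict a 25-dim multi-hot label vector from one radiology report."""
    inputs = tokenizer(report_text, truncation=True, max_length=512,
                       return_tensors="pt")
    with torch.no_grad():
        logits = labeler(**inputs).logits
    return (logits.sigmoid() > threshold).float().squeeze(0)

# Stage 2: a multi-label image CNN trained on those weak labels
# (DenseNet-121 is an illustrative choice, not stated in the abstract).
cnn = models.densenet121(weights=None)
cnn.classifier = nn.Linear(cnn.classifier.in_features, NUM_LABELS)
criterion = nn.BCEWithLogitsLoss()  # one sigmoid per abnormal sign
optimizer = torch.optim.Adam(cnn.parameters(), lr=1e-4)

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One weakly supervised update: images (N,3,H,W), labels (N,25)."""
    optimizer.zero_grad()
    loss = criterion(cnn(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```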
In three external test cohorts of 5,996 symptomatic patients, 2,130 screening examinees, and 1,804 community clinic patients, the mean AUC of the CNN for identifying the 25 abnormal signs reaches 0.866 ± 0.110, 0.891 ± 0.147, and 0.796 ± 0.157, respectively. In symptomatic patients, the CNN shows no significant difference from local radiologists in identifying 21 signs (p > 0.05) but performs worse for 4 signs (p < 0.05). In screening examinees, the CNN shows no significant difference for 17 signs (p > 0.05) but performs worse at classifying nodules (p = 0.013). In community clinic patients, the CNN shows no significant difference for 12 signs (p > 0.05) and performs better for 6 signs (p < 0.001).
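To make the reported metric concrete, the per-label AUCs and their summary in the form "mean ± SD" quoted above can be computed as in the sketch below, using scikit-learn. The abstract does not state which statistical test produced the p-values for the CNN-versus-radiologist comparisons, so only the AUC summary is shown; the data here are synthetic stand-ins.

```python
# Sketch of per-label AUC evaluation as reported above (scikit-learn;
# the statistical test behind the quoted p-values is not reproduced here).
import numpy as np
from sklearn.metrics import roc_auc_score

def per_label_auc(y_true: np.ndarray, y_score: np.ndarray) -> np.ndarray:
    """AUC for each of the 25 signs; y_true and y_score have shape (N, 25)."""
    return np.array([roc_auc_score(y_true[:, k], y_score[:, k])
                     for k in range(y_true.shape[1])])

# Synthetic data standing in for one external test cohort.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(1000, 25))          # reference labels
y_score = np.clip(y_true + rng.normal(0, 0.8, size=(1000, 25)), 0, 1)

aucs = per_label_auc(y_true, y_score)
print(f"mean AUC {aucs.mean():.3f} \u00b1 {aucs.std():.3f}")  # mean ± SD over labels
```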
We construct and validate an effective CXR interpretation system based on natural language processing.