Bai Xiang, Wang Hanchen, Ma Liya, Xu Yongchao, Gan Jiefeng, Fan Ziwei, Yang Fan, Ma Ke, Yang Jiehua, Bai Song, Shu Chang, Zou Xinyu, Huang Renhao, Zhang Changzheng, Liu Xiaowu, Tu Dandan, Xu Chuou, Zhang Wenqing, Wang Xi, Chen Anguo, Zeng Yu, Yang Dehua, Wang Ming-Wei, Holalkere Nagaraj, Halin Neil J, Kamel Ihab R, Wu Jia, Peng Xuehua, Wang Xiang, Shao Jianbo, Mongkolwat Pattanasak, Zhang Jianjun, Liu Weiyang, Roberts Michael, Teng Zhongzhao, Beer Lucian, Escudero Sanchez Lorena, Sala Evis, Rubin Daniel, Weller Adrian, Lasenby Joan, Zheng Chuangsheng, Wang Jianming, Li Zhen, Schönlieb Carola-Bibiane, Xia Tian
Department of Radiology, Tongji Hospital and Medical College, Huazhong University of Science and Technology, Wuhan, China.
School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China.
ArXiv. 2021 Nov 18:arXiv:2111.09461v1.
Artificial intelligence (AI) offers a promising alternative for streamlining COVID-19 diagnosis. However, concerns surrounding security and trustworthiness impede the collection of large-scale representative medical data, posing a considerable challenge for training a well-generalised model in clinical practice. To address this, we launch the Unified CT-COVID AI Diagnostic Initiative (UCADI), in which the AI model can be trained in a distributed manner and executed independently at each host institution under a federated learning (FL) framework, without data sharing. Here we show that our FL model outperformed all the local models by a large margin (test sensitivity/specificity in China: 0.973/0.951, in the UK: 0.730/0.942), achieving performance comparable to that of a panel of professional radiologists. We further evaluated the model on hold-out data (collected from two further hospitals that did not take part in the FL) and heterogeneous data (acquired with contrast materials), provided visual explanations for decisions made by the model, and analysed the trade-offs between model performance and communication costs in the federated training process. Our study is based on 9,573 chest computed tomography scans (CTs) from 3,336 patients collected from 23 hospitals located in China and the UK. Collectively, our work advances the prospects of utilising federated learning for privacy-preserving AI in digital health.
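To make the idea of training without data sharing more concrete, the following is a minimal sketch of one aggregation round, assuming a sample-size-weighted parameter average (FedAvg-style). The abstract does not state which aggregation rule UCADI uses, so this rule, the function names `federated_average` and `sensitivity_specificity`, and the dictionary-of-lists weight layout are all illustrative assumptions rather than the authors' implementation. The second helper shows how the reported sensitivity/specificity pairs are conventionally computed from binary labels.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Average client parameters, weighted by local sample counts.

    Only parameters and sample counts reach the server; the scans
    themselves never leave the institution that holds them.
    This is an assumed FedAvg-style rule, not the paper's confirmed method.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Two hypothetical institutions with 1 and 3 local scans respectively.
    site_a = {"w": [1.0, 2.0]}
    site_b = {"w": [3.0, 4.0]}
    print(federated_average([site_a, site_b], [1, 3]))
    print(sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a full training loop, each institution would update its local copy on its own scans, send the resulting parameters to the server, and receive the averaged parameters back for the next round. The frequency of these exchanges is what drives the communication cost that the study weighs against model performance.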