Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
Department of Ophthalmology, Sao Paulo Federal University, Sao Paulo, Brazil.
BMJ Open Ophthalmol. 2023 Aug;8(1). doi: 10.1136/bmjophth-2022-001216.
Retinopathy of prematurity (ROP) is a vasoproliferative disease responsible for more than 30 000 blind children worldwide. Its diagnosis and treatment are challenging due to the lack of specialists, variable diagnostic concordance and variation in classification standards. While artificial intelligence (AI) can address the shortage of professionals and provide more cost-effective management, its development requires fairness, generalisability and bias controls prior to deployment to avoid producing harmful and unpredictable results. This review aims to compare the characteristics, fairness and generalisability efforts of studies applying AI to ROP.
Our review yielded 220 articles, of which 18 were included after full-text assessment. The articles were classified by task: ROP severity grading, plus disease detection, detection of treatment-requiring ROP, ROP prediction and detection of retinal zones.
The authors and included patients of all the articles are from middle-income and high-income countries, with no representation of low-income countries, South America, Australia or Africa. Code is available in two articles and on request in one, while data are not available in any article. The same retinal camera was used in 88.9% of the studies. Patients' sex was described in two articles, but none applied bias control in their models.
The reviewed articles included 180 228 images and reported good metrics, but fairness, generalisability and bias control remained limited. Reproducibility is also a critical limitation, with few articles sharing code and none sharing data. Fair and generalisable studies of AI for ROP are needed, with diverse datasets, data and code sharing, collaborative research and bias control, to avoid unpredictable and harmful deployments.