Ramwala Ojas A, Lowry Kathryn P, Hippe Daniel S, Unrath Matthew P N, Nyflot Matthew J, Mooney Sean D, Lee Christoph I
Department of Biomedical Informatics and Medical Education, University of Washington School of Medicine, Seattle, Washington, 98195, USA.
Department of Radiology, University of Washington School of Medicine, Seattle, Washington, 98195, USA.
Pac Symp Biocomput. 2025;30:215-228. doi: 10.1142/9789819807024_0016.
Artificial intelligence (AI) algorithms have the potential to drive a paradigm shift in clinical medicine, especially in medical imaging. Concerns about model generalizability and bias necessitate rigorous external validation of AI algorithms before they are adopted into clinical workflows. To address barriers related to patient privacy, intellectual property, and diverse model requirements, we introduce ClinValAI, a framework for establishing robust cloud-based infrastructures to clinically validate AI algorithms in medical imaging. With dedicated workflows for data ingestion, algorithm scoring, and output processing, it offers an easily customizable method to assess AI models and investigate biases. Our novel orchestration mechanism makes full use of the cloud computing environment. ClinValAI's input auditing and standardization mechanisms ensure that the algorithm receives inputs consistent with its prerequisites, streamlining validation. The scoring workflow comprises multiple steps to facilitate consistent inference and systematic troubleshooting. The output processing workflow helps identify and analyze samples with missing results and aggregates final outputs for downstream analysis. We demonstrate the usability of our work by evaluating a state-of-the-art breast cancer risk prediction algorithm on a large and diverse dataset of 2D screening mammograms. We perform comprehensive statistical analysis to study model calibration and evaluate performance across important factors, including breast density, age, and race, to identify latent biases. ClinValAI provides a holistic framework for validating medical imaging models and has the potential to advance the development of generalizable AI models in clinical medicine and promote health equity.
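The output processing and subgroup evaluation steps described in the abstract can be illustrated with a minimal sketch. This is not ClinValAI's implementation: the record layout, the field names (`id`, `density`, `label`, `score`), the helper names, and the toy values are all hypothetical, and discrimination is summarized here with a plain AUC computed from pairwise comparisons, which may differ from the statistical analysis the authors performed.

```python
from collections import defaultdict


def split_missing(records):
    """Separate records with a model score from those whose result is missing."""
    scored = [r for r in records if r["score"] is not None]
    missing = [r for r in records if r["score"] is None]
    return scored, missing


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when a class is absent,
    because AUC is undefined in that case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(records, group_key):
    """Compute AUC separately for each value of a hypothetical subgroup field."""
    groups = defaultdict(list)
    for r in records:
        groups[r[group_key]].append(r)
    return {
        g: auc([r["label"] for r in rows], [r["score"] for r in rows])
        for g, rows in groups.items()
    }


# Toy, made-up records for illustration only (not study data).
records = [
    {"id": 1, "density": "dense", "label": 1, "score": 0.90},
    {"id": 2, "density": "dense", "label": 0, "score": 0.20},
    {"id": 3, "density": "dense", "label": 0, "score": 0.95},
    {"id": 4, "density": "non-dense", "label": 1, "score": 0.70},
    {"id": 5, "density": "non-dense", "label": 0, "score": 0.10},
    {"id": 6, "density": "non-dense", "label": 1, "score": None},
]

scored, missing = split_missing(records)
subgroup_auc = auc_by_subgroup(scored, "density")
missing_ids = [r["id"] for r in missing]
```

Samples with missing results are set aside for separate review before aggregation, so they do not silently distort the subgroup metrics. A gap between subgroup AUCs would only be a prompt for closer investigation, not proof of bias, particularly when subgroups are small.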