Prince Eric W, Hankinson Todd C, Görg Carsten
Department of Neurosurgery, University of Colorado Anschutz Medical Campus, Aurora, Colorado 80045, USA.
Pac Symp Biocomput. 2025;30:40-53. doi: 10.1142/9789819807024_0004.
Human involvement remains critical in most instances of clinical decision-making. Recent advances in AI and machine learning have opened the door for designing, implementing, and translating interactive AI systems to support clinicians in decision-making. Assessing the impact and implications of such systems on patient care and clinical workflows requires in-depth studies. Conducting evaluation studies of AI-supported interactive systems for decision-making in clinical settings is challenging and time-consuming. These studies involve carefully collecting, analyzing, and interpreting quantitative and qualitative data to assess the performance of the underlying AI-supported system, its impact on the human decision-making process, and the implications for patient care. We have previously developed a toolkit for designing and implementing clinical AI software so that it can be subjected to an application-based evaluation. Here, we present a visual analytics framework for analyzing and interpreting the data collected during such an evaluation process. Our framework supports identifying subgroups of users and patients based on their characteristics, detecting outliers among them, and providing evidence to ensure adherence to regulatory guidelines. We used early-stage clinical AI regulatory guidelines to drive the system design, implemented multiple-factor analysis and hierarchical clustering as exemplary analysis tools, and provided interactive visualizations to explore and interpret results. We demonstrate the effectiveness of our framework through a case study evaluating a prototype AI-based clinical decision-support system for diagnosing pediatric brain tumors.
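The abstract names hierarchical clustering as one exemplary tool for identifying subgroups of users and patients. A minimal sketch of that step, assuming standardized numeric participant features (the feature names, data, and cluster count here are illustrative assumptions, not taken from the paper):

```python
# Hypothetical example: hierarchical clustering of evaluation-study
# participants to surface subgroups, as the framework's analysis step does.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Illustrative standardized features per clinician (e.g., years of
# experience, task completion time, agreement with AI recommendation).
features = np.vstack([
    rng.normal(0.0, 0.3, size=(10, 3)),  # one hypothetical user subgroup
    rng.normal(2.0, 0.3, size=(10, 3)),  # a second, well-separated subgroup
])

# Ward linkage builds the dendrogram; cutting it at a chosen number of
# clusters yields subgroup labels for downstream visualization.
Z = linkage(features, method="ward")
labels = fcluster(Z, t=2, criterion="maxclust")
print(sorted(set(labels)))  # -> [1, 2]
```

In an interactive setting, the dendrogram itself (via `scipy.cluster.hierarchy.dendrogram`) would typically be rendered so analysts can choose the cut height visually rather than fixing the cluster count in advance.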