Statistics of Generative Artificial Intelligence and Nongenerative Predictive Analytics Machine Learning in Medicine.

Authors

Rashidi Hooman H, Hu Bo, Pantanowitz Joshua, Tran Nam, Liu Silvia, Chamanzar Alireza, Gur Mert, Chang Chung-Chou H, Wang Yanshan, Tafti Ahmad, Pantanowitz Liron, Hanna Matthew G

Affiliations

Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania; Computational Pathology and AI Center of Excellence, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania.

Department of Quantitative Health Sciences, Cleveland Clinic, Cleveland, Ohio.

Publication information

Mod Pathol. 2025 Mar;38(3):100663. doi: 10.1016/j.modpat.2024.100663. Epub 2024 Nov 22.

Abstract

The rapidly evolving landscape of artificial intelligence (AI) and machine learning (ML) in medicine has prompted medical professionals to increasingly familiarize themselves with related topics. This also demands a grasp of the underlying statistical principles that govern the design, validation, and reproducibility of AI/ML models. Uniquely, the practice of pathology and medicine produces vast amounts of data that can be exploited by AI/ML. The emergence of generative AI, especially in the area of large language models and multimodal frameworks, represents approaches that are starting to transform medicine. Fundamentally, generative and traditional (eg, nongenerative predictive analytics) ML techniques rely on certain common statistical measures to function. However, unique to generative AI are metrics such as, but not limited to, perplexity and the Bilingual Evaluation Understudy (BLEU) score, which provide a means to assess the quality of generated samples and are typically unfamiliar to most medical practitioners. In contrast, nongenerative predictive analytics ML often uses more familiar metrics tailored to specific tasks, as seen in typical classification studies (ie, confusion matrix measures, such as accuracy, sensitivity, F1 score, and area under the receiver operating characteristic curve) or regression studies (ie, root mean square error and R²). To this end, the goal of this review article (part 4 of our AI review series) is to provide an overview and comparison of the statistical measures and methodologies used in both generative AI and traditional (ie, nongenerative predictive analytics) ML, along with their strengths and known limitations. By understanding their similarities and differences along with their respective applications, we will become better stewards of this transformative space, which ultimately enables us to address our current and future needs and challenges in a more responsible and scientifically sound manner.
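As an illustration of several metrics named in the abstract, the following is a minimal Python sketch using their standard textbook definitions. It is not taken from the article: the function names and the toy inputs are hypothetical, binary labels are assumed to be coded as 0 and 1, and perplexity is computed from per-token probabilities that a language model is assumed to have already assigned to the reference text. BLEU and the area under the receiver operating characteristic curve are omitted for brevity.

```python
import math


def classification_metrics(y_true, y_pred):
    """Confusion-matrix measures for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity, "f1": f1}


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def perplexity(token_probs):
    """Exponential of the mean negative log-likelihood per token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    print(classification_metrics([1, 1, 1, 0, 0, 0, 0, 1],
                                 [1, 1, 0, 0, 0, 1, 0, 1]))
    print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

In this sketch, lower perplexity means the model found the text less surprising; a model that assigns probability 0.25 to every token has a perplexity of about 4, as if it were choosing uniformly among four options at each step.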
