Studying human-AI collaboration protocols: the case of the Kasparov's law in radiological double reading.

Authors

Cabitza Federico, Campagner Andrea, Sconfienza Luca Maria

Affiliations

Università degli Studi di Milano-Bicocca, Viale Sarca 336, 20126 Milan, Italy.

Department of Biomedical Sciences for Health, University of Milan, Milan, Italy.

Publication information

Health Inf Sci Syst. 2021 Feb 5;9(1):8. doi: 10.1007/s13755-021-00138-8. eCollection 2021 Dec.

DOI:10.1007/s13755-021-00138-8

PMID:33585029

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7864624/

Abstract

PURPOSE

The integration of Artificial Intelligence into medical practices has recently been advocated for its promise to bring increased efficiency and effectiveness to these practices. Nonetheless, little research has so far been aimed at understanding the best human-AI interaction protocols in collaborative tasks, even in currently more viable settings, such as independent double-reading screening tasks.

METHODS

To this aim, we report on a retrospective case-control study involving 12 board-certified radiologists in the detection of knee lesions by means of Magnetic Resonance Imaging, in which we simulated the serial combination of two Deep Learning models with humans in eight double-reading protocols. Inspired by the so-called Kasparov's law, we investigate whether the combination of humans and AI models could achieve better performance than AI models alone, and whether weak readers, when supported by fit-for-use interaction protocols, could outperform stronger readers.

RESULTS

We discuss two main findings: (1) groups of humans who perform significantly worse than a state-of-the-art AI can significantly outperform it if their judgements are aggregated by majority voting (in accordance with the first part of Kasparov's law); (2) small ensembles of significantly weaker readers can significantly outperform teams of stronger readers supported by the same computational tool, when the judgements of the former are combined within "fit-for-use" protocols (in accordance with the second part of Kasparov's law).

CONCLUSION

Our study shows that good interaction protocols can guarantee improved decision performance that easily surpasses the performance of individual agents, even of realistic super-human AI systems. This finding highlights the importance of focusing on how to guarantee better co-operation within human-AI teams, so as to enable safer and more human-sustainable care practices.
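The majority-voting aggregation mentioned in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `majority_vote` and `accuracy`, the three readers, and the six binary labels are all hypothetical toy values chosen for illustration, not data from the study. The sketch only shows how readers who each make some errors can, when their errors fall on different cases, produce a more accurate group decision.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent binary label among an odd number of readers."""
    if len(labels) % 2 == 0:
        raise ValueError("use an odd number of readers to avoid ties")
    return Counter(labels).most_common(1)[0][0]


def accuracy(predictions, truth):
    """Fraction of cases where the prediction matches the reference label."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


# Hypothetical toy data (NOT from the study): 1 = lesion, 0 = no lesion.
truth = [1, 0, 1, 1, 0, 0]
readers = [
    [1, 0, 0, 1, 0, 1],  # hypothetical reader A
    [1, 1, 1, 1, 0, 0],  # hypothetical reader B
    [0, 0, 1, 1, 1, 0],  # hypothetical reader C
]

# Aggregate case by case: zip(*readers) yields one tuple of votes per case.
group = [majority_vote(votes) for votes in zip(*readers)]

individual = [accuracy(r, truth) for r in readers]
print("individual accuracies:", individual)
print("majority-vote accuracy:", accuracy(group, truth))
```

In this toy setting no two readers err on the same case, so the vote corrects every individual mistake; with correlated errors the gain would be smaller or absent.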

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4090/7865071/d28cec9a7246/13755_2021_138_Fig1_HTML.jpg

Similar articles

Rams, hounds and white boxes: Investigating human-AI collaboration protocols in medical diagnosis.

Artif Intell Med. 2023 Apr;138:102506. doi: 10.1016/j.artmed.2023.102506. Epub 2023 Feb 8.

Using deep learning to assist readers during the arbitration process: a lesion-based retrospective evaluation of breast cancer screening performance.

Eur Radiol. 2022 Feb;32(2):842-852. doi: 10.1007/s00330-021-08217-w. Epub 2021 Aug 12.

Deep learning powered coronary CT angiography for detecting obstructive coronary artery disease: The effect of reader experience, calcification and image quality.

Eur J Radiol. 2021 Sep;142:109835. doi: 10.1016/j.ejrad.2021.109835. Epub 2021 Jun 27.

Impact of artificial intelligence support on accuracy and reading time in breast tomosynthesis image interpretation: a multi-reader multi-case study.

Eur Radiol. 2021 Nov;31(11):8682-8691. doi: 10.1007/s00330-021-07992-w. Epub 2021 May 4.

AI-based improvement in lung cancer detection on chest radiographs: results of a multi-reader study in NLST dataset.

Eur Radiol. 2021 Dec;31(12):9664-9674. doi: 10.1007/s00330-021-08074-7. Epub 2021 Jun 4.

Improving Radiographic Fracture Recognition Performance and Efficiency Using Artificial Intelligence.

Radiology. 2022 Mar;302(3):627-636. doi: 10.1148/radiol.210937. Epub 2021 Dec 21.

Unity Is Intelligence: A Collective Intelligence Experiment on ECG Reading to Improve Diagnostic Performance in Cardiology.

J Intell. 2021 Apr 1;9(2):17. doi: 10.3390/jintelligence9020017.

Today's radiologists meet tomorrow's AI: the promises, pitfalls, and unbridled potential.

Quant Imaging Med Surg. 2021 Jun;11(6):2775-2779. doi: 10.21037/qims-20-1083.

Retrospective comparison between single reading plus an artificial intelligence algorithm and two-view digital tomosynthesis with double reading in breast screening.

J Med Screen. 2021 Sep;28(3):365-368. doi: 10.1177/0969141320984198. Epub 2021 Jan 5.

Cited by

A vision of human-AI collaboration for enhanced biological collection curation and research.

Bioscience. 2025 Mar 28;75(6):457-471. doi: 10.1093/biosci/biaf021. eCollection 2025 Jun.

Machine learning in dentistry and oral surgery: charting the course with bibliometric insights.

Head Face Med. 2025 Jun 4;21(1):44. doi: 10.1186/s13005-025-00521-w.

Enhancing Radiologist Productivity with Artificial Intelligence in Magnetic Resonance Imaging (MRI): A Narrative Review.

Diagnostics (Basel). 2025 Apr 30;15(9):1146. doi: 10.3390/diagnostics15091146.

On prediction-modelers and decision-makers: why fairness requires more than a fair prediction model.

AI Soc. 2025;40(2):353-369. doi: 10.1007/s00146-024-01886-3. Epub 2024 Mar 16.

Human-AI collaboration in large language model-assisted brain MRI differential diagnosis: a usability study.

Eur Radiol. 2025 Mar 7. doi: 10.1007/s00330-025-11484-6.

Second opinion machine learning for fast-track pathway assignment in hip and knee replacement surgery: the use of patient-reported outcome measures.

BMC Med Inform Decis Mak. 2024 Jul 23;24(Suppl 4):203. doi: 10.1186/s12911-024-02602-3.

Automation in ART: Paving the Way for the Future of Infertility Treatment.

Reprod Sci. 2023 Apr;30(4):1006-1016. doi: 10.1007/s43032-022-00941-y. Epub 2022 Aug 3.

Unity Is Intelligence: A Collective Intelligence Experiment on ECG Reading to Improve Diagnostic Performance in Cardiology.

J Intell. 2021 Apr 1;9(2):17. doi: 10.3390/jintelligence9020017.
