Google Health, London, UK.
DeepMind, London, UK.
BMC Med. 2019 Oct 29;17(1):195. doi: 10.1186/s12916-019-1426-2.
Artificial intelligence (AI) research in healthcare is accelerating rapidly, with potential applications being demonstrated across various domains of medicine. However, there are currently limited examples of such techniques being successfully deployed into clinical practice. This article explores the main challenges and limitations of AI in healthcare, and considers the steps required to translate these potentially transformative technologies from research to clinical practice.
Key challenges for the translation of AI systems in healthcare include those intrinsic to the science of machine learning, logistical difficulties in implementation, and consideration of the barriers to adoption as well as of the necessary sociocultural or pathway changes. Robust peer-reviewed clinical evaluation as part of randomised controlled trials should be viewed as the gold standard for evidence generation, but conducting these in practice may not always be appropriate or feasible. Performance metrics should aim to capture real clinical applicability and be understandable to intended users. Regulation that balances the pace of innovation with the potential for harm, alongside thoughtful post-market surveillance, is required to ensure that patients are not exposed to dangerous interventions nor deprived of access to beneficial innovations. Mechanisms to enable direct comparisons of AI systems must be developed, including the use of independent, local and representative test sets. Developers of AI algorithms must be vigilant to potential dangers, including dataset shift, accidental fitting of confounders, unintended discriminatory bias, the challenges of generalisation to new populations, and the unintended negative consequences of new algorithms on health outcomes.
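The point that performance metrics should "capture real clinical applicability" can be made concrete with a short sketch. A commonly cited example is that positive predictive value (PPV) depends strongly on disease prevalence, so a headline sensitivity/specificity pair reported on a curated research dataset may translate poorly to the intended clinical population. The figures below are hypothetical, chosen only to illustrate the arithmetic:

```python
# Illustration with hypothetical figures: the same sensitivity and
# specificity yield very different positive predictive values (PPV)
# at different disease prevalences - one reason reported metrics must
# reflect the clinical setting in which a model will actually be used.

def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# A model with 90% sensitivity and 90% specificity looks strong on a
# balanced research test set, but its PPV collapses at screening-level
# prevalence (all numbers hypothetical):
for prev in (0.30, 0.05, 0.01):
    print(f"prevalence {prev:>4.0%}: PPV = {ppv(0.9, 0.9, prev):.1%}")
```

At 30% prevalence the sketch gives a PPV near 79%, but at 1% prevalence it falls to roughly 8%, meaning most positive calls would be false alarms in a screening setting.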
The safe and timely translation of AI research into clinically validated and appropriately regulated systems that can benefit everyone is challenging. Robust clinical evaluation, using metrics that are intuitive to clinicians and ideally go beyond measures of technical accuracy to include quality of care and patient outcomes, is essential. Further work is required (1) to identify themes of algorithmic bias and unfairness while developing mitigations to address these, (2) to reduce brittleness and improve generalisability, and (3) to develop methods for improved interpretability of machine learning predictions. If these goals can be achieved, the benefits for patients are likely to be transformational.
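One concrete way to surface the algorithmic bias and unfairness theme above is to stratify evaluation metrics by patient subgroup, since an aggregate figure can hide a subgroup on which the model underperforms. The sketch below uses entirely made-up records and a single metric (sensitivity) purely for illustration; real fairness audits would examine multiple metrics and clinically meaningful strata:

```python
# Minimal sketch with hypothetical data: stratifying sensitivity
# (recall on true positives) by subgroup to expose performance
# disparities that an aggregate metric would conceal.

def sensitivity(labels, preds):
    """Fraction of true-positive cases the model correctly flags."""
    positives = [p for l, p in zip(labels, preds) if l == 1]
    return sum(positives) / len(positives) if positives else float("nan")

def sensitivity_by_subgroup(records):
    """records: iterable of (subgroup, true_label, predicted_label)."""
    groups: dict = {}
    for group, label, pred in records:
        labels, preds = groups.setdefault(group, ([], []))
        labels.append(label)
        preds.append(pred)
    return {g: sensitivity(ls, ps) for g, (ls, ps) in groups.items()}

# Hypothetical evaluation records: the pooled sensitivity is 4/6,
# which masks that subgroup B is served far worse than subgroup A.
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 1, 1), ("A", 0, 0),
    ("B", 1, 0), ("B", 1, 1), ("B", 1, 0), ("B", 0, 0),
]
print(sensitivity_by_subgroup(records))  # A: 1.0, B: ~0.33
```

The same stratified-evaluation pattern applies to the generalisability concern: computing metrics on an independent, local and representative test set, broken down by site and population, is how dataset shift and hidden confounding typically come to light.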