Komp John, Boggaram Aaptha, Kao David P, Trivedi Ashutosh, Rosenberg Michael A
College of Engineering and Applied Science, University of Colorado, Boulder, CO, USA.
Division of Cardiology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.
Med Res Arch. 2025 Mar;13(3). doi: 10.18103/mra.v13i3.6363. Epub 2025 Mar 29.
The programming of cardiac implantable electronic devices, such as pacemakers and implantable defibrillators, represents a promising domain for the application of automated learning systems. These systems, leveraging a type of artificial intelligence called reinforcement learning, have the potential to personalize medical treatment by adapting device settings based on an individual's physiological responses. At the core of these self-learning algorithms is the principle of balancing exploration and exploitation. Exploitation refers to the selection of device programming settings previously demonstrated to provide clinical benefit, while exploration refers to the real-time search for adjustments to device programming that could provide an improvement in clinical outcomes for each individual. Exploration is a critical component of the reinforcement learning algorithm, and provides the opportunity to identify settings that could directly benefit individual patients. However, unconstrained exploration poses risks, as an automated change in certain settings may lead to adverse clinical outcomes. To mitigate these risks, several strategies have been proposed to ensure that algorithm-driven programming changes achieve the desired level of individualized optimization without compromising patient safety. In this review, we examine the existing literature on safe reinforcement learning algorithms in automated systems and discuss their potential application to the programming of cardiac implantable electronic devices.
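To make the exploration and exploitation trade-off described in the abstract concrete, the following is a minimal sketch of one simple form of constrained exploration: an epsilon-greedy learner that is only ever allowed to choose from a predefined set of safe settings. It is not taken from the review and does not represent any device's actual programming logic. The function names (`safe_epsilon_greedy`, `update_estimate`, `simulate`), the four abstract "settings", their benefit probabilities, and the choice of which setting is treated as unsafe are all hypothetical values chosen purely for illustration.

```python
import random


def safe_epsilon_greedy(estimates, safe_actions, epsilon, rng):
    """Choose a setting index, restricted to the allowed (safe) set.

    With probability epsilon, explore: pick a random safe setting.
    Otherwise, exploit: pick the safe setting with the best estimate so far.
    """
    if not safe_actions:
        raise ValueError("safe_actions must not be empty")
    candidates = sorted(safe_actions)
    if rng.random() < epsilon:
        return rng.choice(candidates)  # exploration, never leaves the safe set
    return max(candidates, key=lambda a: estimates[a])  # exploitation


def update_estimate(estimates, counts, action, reward):
    """Incremental running mean of the observed reward for one setting."""
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]


def simulate(n_steps=500, epsilon=0.1, seed=0):
    """Toy simulation; every number below is a hypothetical assumption."""
    rng = random.Random(seed)
    # Hypothetical probability of a "beneficial response" for each setting.
    true_benefit = [0.2, 0.5, 0.7, 0.9]
    # Assumption: setting 3 is excluded up front as clinically unsafe,
    # even though it has the highest simulated benefit.
    safe_actions = {0, 1, 2}
    estimates = [0.0] * len(true_benefit)
    counts = [0] * len(true_benefit)
    for _ in range(n_steps):
        action = safe_epsilon_greedy(estimates, safe_actions, epsilon, rng)
        reward = 1.0 if rng.random() < true_benefit[action] else 0.0
        update_estimate(estimates, counts, action, reward)
    return estimates, counts


if __name__ == "__main__":
    estimates, counts = simulate()
    print("estimated benefit per setting:", estimates)
    print("times each setting was tried:", counts)
```

In this sketch the safety constraint is a hard filter applied before any choice is made, so the excluded setting is never tried regardless of how large `epsilon` is. This is only the simplest of the possible approaches to safe exploration, and it assumes the safe set is known in advance.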