Campbell Nicole, Kalabalik-Hoganson Julie
Department of Pharmacy Practice, Fairleigh Dickinson University, Florham Park, NJ, USA.
Int J Clin Pharm. 2025 Jun 10. doi: 10.1007/s11096-025-01947-7.
Pharmaceutical calculations are required elements of the Doctor of Pharmacy curriculum in the United States. With the growth of artificial intelligence chatbots, pharmacists and educators are exploring their application. The accuracy of artificial intelligence chatbots in performing pharmaceutical calculations remains unknown.
To evaluate the accuracy of artificial intelligence chatbots for pharmaceutical calculations.
Eleven free-access chatbots were tested using 7 faculty-generated questions: 1 control, 2 creatinine clearance, 1 oral-to-intravenous dose conversion, 2 antibiotic pharmacokinetic dosing, and 1 number needed to harm. Descriptive statistics were used to evaluate the primary outcome, the proportion of correct responses. Secondary outcomes included types of errors and teachability.
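To illustrate the kind of arithmetic two of these question types involve, the following is a minimal Python sketch. The abstract does not state which creatinine clearance equation, patient values, or rounding rules the study questions used. The choice of the Cockcroft-Gault equation here is an assumption, and the function names and example inputs are hypothetical.

```python
def cockcroft_gault(age_years, weight_kg, serum_creatinine_mg_dl, female):
    """Estimate creatinine clearance (mL/min) with the Cockcroft-Gault equation.

    Assumption: the abstract does not name the equation the study used.
    Which weight to pass in (actual, ideal, or adjusted body weight) is a
    clinical decision this sketch leaves to the caller.
    """
    crcl = (140 - age_years) * weight_kg / (72 * serum_creatinine_mg_dl)
    # The equation applies a 0.85 correction factor for female patients.
    return crcl * 0.85 if female else crcl


def number_needed_to_harm(risk_exposed, risk_control):
    """Return the unrounded number needed to harm.

    NNH is the reciprocal of the absolute risk increase. Rounding
    conventions vary, so the value is left unrounded here.
    """
    absolute_risk_increase = risk_exposed - risk_control
    if absolute_risk_increase <= 0:
        raise ValueError("exposed risk must exceed control risk")
    return 1 / absolute_risk_increase


# Hypothetical inputs, not taken from the study questions.
print(round(cockcroft_gault(60, 72, 1.0, female=False), 1))
print(round(number_needed_to_harm(0.15, 0.10), 1))
```

The weight argument is left to the caller because choosing between actual, ideal, and adjusted body weight is a separate step from evaluating the equation.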
Ten (90.9%) chatbots answered the control question correctly, and all answered the dose conversion question correctly. Eight (72.7%) chatbots correctly calculated number needed to harm. Only 1 (9.1%) provided the correct antibiotic dosing, and none correctly calculated creatinine clearance. Common errors included incorrect weight selection for creatinine clearance and use of incorrect formulas. Nine (81.8%) chatbots were teachable on at least 1 question.
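The reported percentages follow directly from the counts out of 11 chatbots. As a rough check, the short Python sketch below recomputes them. The dictionary keys and the `percent` helper are illustrative names, not part of the study, and the counts are entered per question category exactly as the abstract reports them.

```python
TOTAL_CHATBOTS = 11

# Counts as reported in the abstract (chatbots out of 11).
reported_counts = {
    "control question": 10,
    "dose conversion": 11,
    "number needed to harm": 8,
    "antibiotic dosing": 1,
    "creatinine clearance": 0,
    "teachable on at least 1 question": 9,
}


def percent(count, total=TOTAL_CHATBOTS):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


for label, count in reported_counts.items():
    print(f"{label}: {count}/{TOTAL_CHATBOTS} = {percent(count)}%")
```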
Artificial intelligence chatbots demonstrated limited accuracy for multi-step pharmaceutical calculations and may be more reliable for low-complexity calculations.