GPT-4 generates accurate and readable patient education materials aligned with current oncological guidelines: A randomized assessment.

Author Information

Rodler Severin, Cei Francesco, Ganjavi Conner, Checcucci Enrico, De Backer Pieter, Rivero Belenchon Ines, Taratkin Mark, Puliatti Stefano, Veccia Alessandro, Piazza Pietro, Baekelandt Loïc, Kowalewski Karl-Friedrich, Gómez Rivas Juan, Fankhauser Christian D, Moschini Marco, Gandaglia Giorgio, Campi Riccardo, De Castro Abreu Andre, Russo Giorgio I, Cocci Andrea, Maruccia Serena, Cacciamani Giovanni E

Affiliations

USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, University of Southern California, Los Angeles, California, United States of America.

Artificial Intelligence Center at USC Urology, USC Institute of Urology, University of Southern California, Los Angeles, California, United States of America.

Publication Information

PLoS One. 2025 Jun 4;20(6):e0324175. doi: 10.1371/journal.pone.0324175. eCollection 2025.

Abstract

INTRODUCTION AND AIM

Guideline-based patient educational materials (PEMs) empower patients and reduce misinformation, but they require frequent updates and must be adapted to patients' reading levels. This study assessed whether generative artificial intelligence (GenAI) can produce readable, accurate, and up-to-date PEMs that can subsequently be translated into multiple languages for broad dissemination.

STUDY DESIGN AND METHODS

The European Association of Urology (EAU) guidelines for prostate, bladder, kidney, and testicular cancer were used as the knowledge base for GPT-4 to generate PEMs. The PEMs were additionally translated into five commonly spoken languages of the European Union (EU). The study was conducted as a single-blinded, online randomized assessment survey. After an initial pilot assessment of the GenAI-generated PEMs, thirty-two members of the Young Academic Urologists (YAU) groups evaluated the accuracy, completeness, and clarity of the original versus the GPT-generated PEMs. The translation assessment involved two native speakers from different YAU groups for each language: Dutch, French, German, Italian, and Spanish. The primary outcomes were readability, accuracy, completeness, faithfulness, and clarity. Readability was measured using the Flesch-Kincaid Reading Ease (FKRE), Flesch-Kincaid Grade Level (FKGL), and Gunning Fog (GFS) scores, and the SMOG (SI), Coleman-Liau (CLI), and Automated Readability (ARI) indices. Accuracy, completeness, faithfulness, and clarity were rated on a 5-point Likert scale.
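The two headline readability metrics used above are simple functions of sentence length and syllable density. The sketch below shows the standard FKRE and FKGL formulas; the vowel-group syllable counter is a rough approximation (published readability tools use dictionaries and more careful heuristics), so exact scores will differ from those reported in the study.

```python
import re

def count_syllables(word: str) -> int:
    """Approximate syllable count: number of contiguous vowel groups (min 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> dict:
    """Compute Flesch-Kincaid Reading Ease (FKRE) and Grade Level (FKGL)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)   # mean words per sentence
    spw = syllables / len(words)        # mean syllables per word
    fkre = 206.835 - 1.015 * wps - 84.6 * spw
    fkgl = 0.39 * wps + 11.8 * spw - 15.59
    return {"FKRE": round(fkre, 1), "FKGL": round(fkgl, 1)}
```

Higher FKRE means easier text (the GPT-4 PEMs averaged 70.8 vs. 43.5 for the originals), while FKGL expresses difficulty as a US school grade level (6.1 vs. 11.6).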

RESULTS

The mean time for GPT-4 to create layperson PEMs based on the latest guidelines was 52.1 seconds. The readability scores of the 8 original PEMs were lower than those of the 8 GPT-4-generated PEMs (mean FKRE: 43.5 vs. 70.8; p < .001). The required reading education levels were higher for the original PEMs than for the GPT-4-generated PEMs (mean FKGL: 11.6 vs. 6.1; p < .001). For all localized urological cancers, the original PEMs did not differ significantly from the GPT-4-generated PEMs in accuracy, completeness, and clarity. Similarly, no differences were observed for metastatic cancers. Translations of the GPT-generated PEMs were rated as faithful in 77.5% of cases and clear in 67.5% of cases.

CONCLUSIONS AND RELEVANCE

GPT-4-generated PEMs have better readability than original PEMs while maintaining similar accuracy, completeness, and clarity. Using GenAI's information-extraction and language capabilities, integrated with human oversight, can significantly reduce the workload and help ensure up-to-date, accurate PEMs.

PATIENT SUMMARY

Some cancer facts written for patients can be hard to read or not in the right words for those with prostate, bladder, kidney, or testicular cancer. This study used AI to quickly make short and easy-to-read content from trusted facts. Doctors checked the AI content and found it was just as accurate, complete, and clear as the original text made for patients. It also worked well in many languages. This AI tool can help providers make it easier for patients to understand their cancer and the best care they can get.

Fig 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/09fc/12136319/372d35dd2c3a/pone.0324175.g001.jpg