Chatbot for the Return of Positive Genetic Screening Results for Hereditary Cancer Syndromes: a Prompt Engineering Study.

Authors

Coen Emma, Del Fiol Guilherme, Kaphingst Kimberly A, Borsato Emerson, Shannon Jackie, Smith Hadley Stevens, Masino Aaron, Allen Caitlin G

Affiliations

Clemson University.

University of Utah.

Publication information

Res Sq. 2024 Aug 29:rs.3.rs-4986527. doi: 10.21203/rs.3.rs-4986527/v1.

DOI:10.21203/rs.3.rs-4986527/v1
PMID:39257988
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11384791/
Abstract

BACKGROUND

The growing demand for genomic testing and limited access to experts necessitate innovative service models. While chatbots have shown promise in supporting genomic services such as pre-test counseling, their use in returning positive genetic results, especially with the more recent large language models (LLMs), remains unexplored.

OBJECTIVE

This study reports the prompt engineering process and intrinsic evaluation of the LLM component of a chatbot designed to support returning positive population-wide genomic screening results.

METHODS

We used a three-step prompt engineering process, including Retrieval-Augmented Generation (RAG) and few-shot techniques, to develop an open-response chatbot. The chatbot was then evaluated using two hypothetical scenarios, with experts rating its performance on a 5-point Likert scale across eight criteria: tone, clarity, program accuracy, domain accuracy, robustness, efficiency, boundaries, and usability.
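
As a rough illustration of how RAG and few-shot prompting can be combined, the sketch below assembles one prompt from retrieved passages plus worked question-and-answer examples. It is not the study's implementation: the abstract does not describe the retriever, the prompt layout, or the source documents, so the keyword-overlap ranking, the function names (`tokenize`, `retrieve`, `build_prompt`), and all passage and example text are hypothetical placeholders.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, passages, k=2):
    """Return the k passages that share the most word tokens with the query.

    Keyword overlap is an assumed stand-in for the (unspecified) retriever.
    """
    query_tokens = tokenize(query)
    ranked = sorted(
        passages,
        key=lambda p: len(query_tokens & tokenize(p)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(system, passages, examples, question, k=2):
    """Combine instructions, retrieved context, few-shot pairs, and the question."""
    parts = [system, "", "Context:"]
    parts += [f"- {p}" for p in retrieve(question, passages, k)]
    parts += ["", "Examples:"]
    for example_q, example_a in examples:
        parts += [f"Q: {example_q}", f"A: {example_a}"]
    parts += ["", f"Q: {question}", "A:"]
    return "\n".join(parts)


# Hypothetical placeholder data, invented purely for illustration.
passages = [
    "The program offers a follow-up appointment with a genetic counselor.",
    "Parking is available behind the clinic.",
    "Results letters are mailed within two weeks.",
]
examples = [("Hypothetical example question?", "Hypothetical example answer.")]
question = "How do I schedule a follow-up appointment?"

prompt = build_prompt(
    "Answer using only the context provided.",
    passages,
    examples,
    question,
    k=1,
)
print(prompt)
```

With `k=1`, only the passage about the follow-up appointment reaches the prompt, which shows how retrieval narrows what the model sees before the few-shot examples shape the answer format.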

RESULTS

The chatbot achieved an overall score of 3.88 out of 5 across all criteria and scenarios. The highest ratings were in Tone (4.25), Usability (4.25), and Boundary management (4.0), followed by Efficiency (3.88), Clarity and Robustness (3.81), and Domain Accuracy (3.63). The lowest-rated criterion was Program Accuracy, which scored 3.25.
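
The reported criterion scores can be tabulated and ranked with a few lines of Python. This is a sketch using only the rounded figures above, with Clarity and Robustness each entered as 3.81. The abstract does not say how the overall score was aggregated across raters and scenarios, so the unweighted mean computed here (about 3.86) is not expected to reproduce the reported 3.88 exactly.

```python
from statistics import mean

# Per-criterion mean ratings as reported in the abstract (5-point Likert scale).
scores = {
    "Tone": 4.25,
    "Usability": 4.25,
    "Boundaries": 4.0,
    "Efficiency": 3.88,
    "Clarity": 3.81,
    "Robustness": 3.81,
    "Domain Accuracy": 3.63,
    "Program Accuracy": 3.25,
}

# Rank criteria from highest to lowest rating.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Unweighted mean of the eight rounded criterion scores (an assumption:
# the study may have averaged individual ratings instead).
unweighted_mean = mean(scores.values())

for criterion, score in ranked:
    print(f"{criterion:<17}{score:.2f}")
print(f"Unweighted mean: {unweighted_mean:.2f}")
```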

DISCUSSION

The LLM handled open-ended queries and maintained boundaries, while the lower Program Accuracy rating indicates areas for improvement. Future work will focus on refining prompts, expanding evaluations, and exploring optimal hybrid chatbot designs that integrate LLM components with rule-based chatbot components to enhance genomic service delivery.
