LLM-Based Response Generation for Korean Adolescents: A Study Using the NAVER Knowledge iN Q&A Dataset with RAG.

Authors

Kim Junseo, Kim Seok Jun, Ahn Junseok, Lee Suehyun

Affiliations

Department of Computer Engineering, College of IT Convergence, Gachon University, Seongnam, Korea.

Department of IT Convergence, Graduate School, Gachon University, Seongnam, Korea.

Publication information

Healthc Inform Res. 2025 Apr;31(2):136-145. doi: 10.4258/hir.2025.31.2.136. Epub 2025 Apr 30.

Abstract

OBJECTIVES

This research aimed to develop a retrieval-augmented generation (RAG)-based large language model (LLM) system that offers personalized and reliable responses to a wide range of concerns raised by Korean adolescents. Our work focuses on building a culturally reflective dataset, designing the system, and validating its effectiveness by comparing the answer quality of RAG-based and non-RAG models.

METHODS

Data were collected from the NAVER Knowledge iN platform, concentrating on posts that featured adolescents' questions and corresponding expert responses during the period 2014-2024. The dataset comprises 3,874 cases, categorized by key negative emotions and the primary sources of worry. The data were processed to remove irrelevant or redundant content and then classified into general and detailed causes. The RAG-based model employed FAISS for similarity-based retrieval of the top three reference cases and used GPT-4o mini for response generation. The responses generated with and without RAG were evaluated using several metrics.
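The retrieve-then-generate flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: a brute-force cosine search over toy bag-of-words vectors stands in for FAISS and a real embedding model, the call to GPT-4o mini is omitted (only the prompt is assembled), and the cases, field names, and prompt wording are hypothetical. Only the choice of three retrieved reference cases comes from the abstract.

```python
import math
import re
from collections import Counter

TOP_K = 3  # the abstract reports retrieving the top three reference cases


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a sentence embedding."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, cases, k=TOP_K):
    """Return the k stored cases whose questions are most similar to the query.

    Brute-force search; a vector index such as FAISS would replace this
    loop in a system of realistic size.
    """
    q = embed(query)
    ranked = sorted(cases, key=lambda c: cosine(q, embed(c["question"])), reverse=True)
    return ranked[:k]


def build_prompt(query, references):
    """Assemble a generation prompt from the retrieved cases (hypothetical wording)."""
    blocks = [
        f"[Case {i}]\nQ: {c['question']}\nA: {c['answer']}"
        for i, c in enumerate(references, 1)
    ]
    return (
        "Reference cases:\n" + "\n\n".join(blocks)
        + "\n\nAdolescent's question:\n" + query
        + "\n\nWrite a specific, empathetic, and actionable reply."
    )


# Hypothetical cases for illustration only; none are taken from the dataset.
CASES = [
    {"id": 1, "question": "I fight with my parents about grades every day",
     "answer": "Try agreeing on one calm time each week to talk about school."},
    {"id": 2, "question": "My friends ignore me at school",
     "answer": "Start with one classmate you trust and talk to them directly."},
    {"id": 3, "question": "I feel stressed about exams and cannot sleep",
     "answer": "Set a fixed bedtime and split revision into short sessions."},
    {"id": 4, "question": "I want to change my hairstyle",
     "answer": "Ask a stylist which cut suits your face shape."},
    {"id": 5, "question": "My parents compare my grades with my sibling",
     "answer": "Tell them how the comparison feels and suggest personal goals."},
]

QUERY = "My parents keep scolding me about my grades"

top_cases = retrieve(QUERY, CASES)
prompt = build_prompt(QUERY, top_cases)
print(prompt)
```

In a non-RAG baseline, the prompt would contain only the adolescent's question; the comparison in the study is between responses generated from these two kinds of prompt.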

RESULTS

RAG-based responses outperformed non-RAG responses across all evaluation metrics. Key findings indicate that RAG-based responses delivered more specific, empathetic, and actionable guidance, particularly when addressing complex emotional and situational concerns. The analysis revealed that family relationships, peer interactions, and academic stress are significant factors affecting adolescents' worries, with depression and stress frequently co-occurring.

CONCLUSIONS

This study demonstrates the potential of RAG-based LLMs to address the diverse and culture-specific worries of Korean adolescents. By integrating external knowledge and offering personalized support, the proposed system provides a scalable approach to enhancing mental health interventions for adolescents. Future research should concentrate on expanding the dataset and improving multiturn conversational capabilities to deliver even more comprehensive support.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fce5/12086440/69f237b972d9/hir-2025-31-2-136f1.jpg
