Priyadarshana Y H P P, Liang Zilu, Piumarta Ian
Kyoto University of Advanced Science (KUAS), Kyoto, Japan.
Front Artif Intell. 2025 May 21;8:1564828. doi: 10.3389/frai.2025.1564828. eCollection 2025.
Few-shot prompting in large language models (LLMs) significantly improves performance across a variety of tasks, including both in-domain and previously unseen natural language tasks, by learning from a limited number of in-context examples. How these examples enhance transferability and contribute to state-of-the-art (SOTA) performance on downstream tasks remains unclear. To address this, we propose a novel LLM transferability framework designed to clarify the selection of the most relevant examples using synthetic free-text explanations. Our hybrid method ranks LLM-generated explanations by selecting the examples semantically closest to the input query while balancing diversity. The top-ranked explanations, together with few-shot examples, are then used to enhance LLMs' knowledge transfer in multi-party conversational modeling for previously unseen depression-detection tasks. Evaluations on the IMHI corpus demonstrate that the proposed framework consistently produces high-quality free-text explanations. Extensive experiments on depression-detection tasks, including depressed utterance classification (DUC) and depressed speaker identification (DSI), show that the framework achieves SOTA performance. The results indicate significant improvements of up to 20.59% in recall for DUC and 21.58% in F1 score for DSI when using 5-shot examples with top-ranked explanations on the RSDD and eRisk 18 T2 corpora. These findings underscore the framework's potential as an effective screening tool for digital mental health applications.
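The relevance-plus-diversity ranking described above can be illustrated with a greedy maximal-marginal-relevance (MMR) style selection over embedding vectors. This is a minimal sketch of that general technique, not the paper's actual implementation; the function name, the trade-off parameter `lam`, and the use of cosine similarity over precomputed embeddings are all assumptions for illustration.

```python
import numpy as np

def rank_explanations(query_vec, expl_vecs, k=5, lam=0.7):
    """Greedily pick k explanation embeddings that are close to the
    query (relevance) while penalizing similarity to already-selected
    items (diversity), in the style of maximal marginal relevance."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    candidates = list(range(len(expl_vecs)))
    selected = []
    while candidates and len(selected) < k:
        def score(i):
            rel = cos(query_vec, expl_vecs[i])
            # Redundancy: closest similarity to anything already chosen.
            red = max((cos(expl_vecs[i], expl_vecs[j]) for j in selected),
                      default=0.0)
            return lam * rel - (1 - lam) * red
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lam` close to 1 the selection is driven almost entirely by similarity to the query; lowering `lam` increasingly prefers candidates that differ from those already chosen, which is how such a scheme balances relevance against diversity.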