
Open-source LLMs for text annotation: a practical guide for model setting and fine-tuning.

Author Information

Alizadeh Meysam, Kubli Maël, Samei Zeynab, Dehghani Shirin, Zahedivafa Mohammadmasiha, Bermeo Juan D, Korobeynikova Maria, Gilardi Fabrizio

Affiliations

Department of Political Science, University of Zurich, 8050 Zurich, Switzerland.

Department of Computer Science, Institute for Fundamental Research, Tehran, Iran.

Publication Information

J Comput Soc Sci. 2025;8(1):17. doi: 10.1007/s42001-024-00345-9. Epub 2024 Dec 18.

Abstract


This paper studies the performance of open-source Large Language Models (LLMs) in text classification tasks typical of political science research. By examining tasks such as stance, topic, and relevance classification, we aim to guide scholars in making informed decisions about their use of LLMs for text analysis and to establish a baseline performance benchmark that demonstrates the models' effectiveness. Specifically, we assess both zero-shot and fine-tuned LLMs across a range of text annotation tasks using datasets of news articles and tweets. Our analysis shows that fine-tuning improves the performance of open-source LLMs, allowing them to match or even surpass zero-shot GPT-3.5 and GPT-4, though they still lag behind fine-tuned GPT-3.5. We further establish that, with a relatively modest quantity of annotated text, fine-tuning is preferable to few-shot training. Our findings show that fine-tuned open-source LLMs can be effectively deployed in a broad spectrum of text annotation applications. We provide a Python notebook facilitating the application of LLMs in text annotation for other researchers.
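As a minimal illustration of the zero-shot setting described above, the sketch below runs a relevance-style annotation loop with an open-source model through the Hugging Face zero-shot-classification pipeline. The model choice, example texts, and label set are assumptions made for illustration; this is not the authors' configuration or their companion notebook.

# Illustrative zero-shot annotation sketch (assumed setup, not the paper's notebook).
# Requires: pip install transformers torch
from transformers import pipeline

# facebook/bart-large-mnli is a common default backbone for zero-shot classification.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Hypothetical tweets and a hypothetical label set for a relevance task.
texts = [
    "The new content moderation bill passed the committee vote today.",
    "Just had the best pizza of my life!",
]
labels = [
    "relevant to content moderation policy",
    "not relevant to content moderation policy",
]

for text in texts:
    result = classifier(text, candidate_labels=labels)
    # Labels come back sorted by score; the first entry is the model's pick.
    print(f"{result['labels'][0]} ({result['scores'][0]:.2f}): {text}")

In practice, zero-shot labels like these would be compared against a human-annotated gold set before being trusted at scale, which is the kind of benchmarking the paper performs.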

Supplementary Information

The online version contains supplementary material available at 10.1007/s42001-024-00345-9.


Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/718a/11655591/d643c6d935df/42001_2024_345_Fig1_HTML.jpg
