An LLM-based hybrid approach for enhanced automated essay scoring.

Authors

Atkinson John, Palma Diego

Affiliation

AI Empowered, Santiago, Chile.

Publication

Sci Rep. 2025 Apr 25;15(1):14551. doi: 10.1038/s41598-025-87862-3.

DOI:10.1038/s41598-025-87862-3
PMID:40280963
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12032364/

Abstract

Automated Essay Scoring systems have traditionally relied on shallow lexical data, such as word frequency and sentence length, to assess essays. However, these approaches neglect crucial factors like text structure and semantics, resulting in limited evaluations of coherence and quality. To address these limitations, we propose a hybrid approach to AES that combines multiple features from different linguistic levels. By leveraging the complementary nature of these features, our model captures the intricate relationships underlying coherent texts. Through extensive experimentation using standard essay datasets, we demonstrate that our large language model based hybrid model surpasses state-of-the-art methods based on shallow features and pure neural networks. This research represents a significant advancement towards the development of an accurate and effective tool for assessing student writing.
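As a rough illustration of what the abstract means by combining features from different linguistic levels, the sketch below computes a few shallow lexical features (average sentence length and word-frequency statistics) and concatenates them with a semantic vector. It is not the authors' model: the function names `shallow_features` and `hybrid_vector`, the choice of three features, and the idea that the semantic vector is supplied by some external language model are all assumptions made for illustration.

```python
import re
from collections import Counter


def shallow_features(essay: str) -> list[float]:
    """Shallow lexical features of the kind the abstract mentions.

    Hypothetical feature set: average sentence length (in words),
    type-token ratio, and the share of the most frequent word.
    """
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    words = re.findall(r"[A-Za-z']+", essay.lower())
    if not words or not sentences:
        return [0.0, 0.0, 0.0]
    counts = Counter(words)
    avg_sentence_len = len(words) / len(sentences)
    type_token_ratio = len(counts) / len(words)
    top_word_share = counts.most_common(1)[0][1] / len(words)
    return [avg_sentence_len, type_token_ratio, top_word_share]


def hybrid_vector(essay: str, semantic_vector: list[float]) -> list[float]:
    """Concatenate shallow features with a semantic vector.

    `semantic_vector` is assumed to come from elsewhere (for example an
    embedding produced by a language model); this sketch does not compute it.
    """
    return shallow_features(essay) + list(semantic_vector)
```

A downstream scorer (a regressor or classifier, not shown here) would take the combined vector as input, so that structural or semantic signals can complement the surface statistics.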

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6c6/12032364/251e4fe429ee/41598_2025_87862_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6c6/12032364/c7ec2c34d7cc/41598_2025_87862_Fig2_HTML.jpg

Similar articles

1
An LLM-based hybrid approach for enhanced automated essay scoring.
Sci Rep. 2025 Apr 25;15(1):14551. doi: 10.1038/s41598-025-87862-3.
2
Automated language essay scoring systems: a literature review.
PeerJ Comput Sci. 2019 Aug 12;5:e208. doi: 10.7717/peerj-cs.208. eCollection 2019.
3
Automated essay scoring with SBERT embeddings and LSTM-Attention networks.
PeerJ Comput Sci. 2025 Feb 11;11:e2634. doi: 10.7717/peerj-cs.2634. eCollection 2025.
4
Automated Radiology Report Labeling in Chest X-Ray Pathologies: Development and Evaluation of a Large Language Model Framework.
JMIR Med Inform. 2025 Mar 28;13:e68618. doi: 10.2196/68618.
5
Improved biomedical word embeddings in the transformer era.
J Biomed Inform. 2021 Aug;120:103867. doi: 10.1016/j.jbi.2021.103867. Epub 2021 Jul 18.
6
SensitiveCancerGPT: Leveraging Generative Large Language Model on Structured Omics Data to Optimize Drug Sensitivity Prediction.
bioRxiv. 2025 Mar 3:2025.02.27.640661. doi: 10.1101/2025.02.27.640661.
7
MMAgentRec, a personalized multi-modal recommendation agent with large language model.
Sci Rep. 2025 Apr 8;15(1):12062. doi: 10.1038/s41598-025-96458-w.
8
Harnessing LLMs for multi-dimensional writing assessment: Reliability and alignment with human judgments.
Heliyon. 2024 Jul 10;10(14):e34262. doi: 10.1016/j.heliyon.2024.e34262. eCollection 2024 Jul 30.
9
An automated essay scoring systems: a systematic literature review.
Artif Intell Rev. 2022;55(3):2495-2527. doi: 10.1007/s10462-021-10068-2. Epub 2021 Sep 23.
10
Assessing the use of multiple sources in student essays.
Behav Res Methods. 2012 Sep;44(3):622-33. doi: 10.3758/s13428-012-0214-0.
