Application of LDA and word2vec to detect English off-topic composition.

Affiliations

College of Culture and Art, Zhejiang Technical Institute of Economics, Hangzhou, Zhejiang, China.

College of Foreign Languages, Zhejiang University of Technology, Hangzhou, Zhejiang, China.

Publication information

PLoS One. 2022 Feb 25;17(2):e0264552. doi: 10.1371/journal.pone.0264552. eCollection 2022.

DOI:10.1371/journal.pone.0264552
PMID:35213641
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8880936/
Abstract

This paper presents an off-topic detection algorithm that combines LDA and word2vec, addressing the lack of accurate and efficient off-topic detection in computer-assisted English composition review systems. The algorithm models each document with LDA and trains word vectors on the documents with word2vec. It then uses the semantic relationship between a document's topics and words to compute a probability-weighted sum over each topic and its feature words, and flags off-topic compositions by applying a threshold. The number of topics is varied, the F value is measured for each setting, and the best-performing number of topics is selected. Experimental results show that the proposed method is more effective than the vector space model: it detects more off-topic compositions with higher accuracy, reaching an F value above 88%. The method automates off-topic detection and can be applied in English composition teaching.
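
The abstract does not give the exact scoring formula, so the following is only a minimal sketch of one plausible reading of it. It assumes that LDA has already produced topic probabilities per essay and feature-word probabilities per topic, and that word2vec has produced word vectors. An essay's relevance is taken to be the sum, over topics and their feature words, of P(topic | essay) × P(word | topic) × the best cosine similarity to a word from the assignment prompt. The function names, the toy vectors, the prompt word list, and the threshold of 0.4 are all hypothetical and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def relevance(doc_topics, topic_words, prompt_words, vectors):
    """Probability-weighted sum over an essay's topics and their feature words.

    doc_topics:   {topic_id: P(topic | essay)}, as an LDA model would supply
    topic_words:  {topic_id: [(word, P(word | topic)), ...]}
    prompt_words: words describing the assigned topic (assumed input)
    vectors:      {word: embedding}, as a word2vec model would supply
    """
    known_prompt = [vectors[w] for w in prompt_words if w in vectors]
    score = 0.0
    for topic, p_topic in doc_topics.items():
        for word, p_word in topic_words[topic]:
            if word in vectors and known_prompt:
                best = max(cosine(vectors[word], p) for p in known_prompt)
                score += p_topic * p_word * best
    return score


def f_value(flagged, truly_off_topic):
    """F measure (harmonic mean of precision and recall) over sets of essay ids."""
    hits = len(flagged & truly_off_topic)
    if hits == 0:
        return 0.0
    precision = hits / len(flagged)
    recall = hits / len(truly_off_topic)
    return 2 * precision * recall / (precision + recall)


# Toy inputs standing in for trained LDA and word2vec models (all hypothetical).
VECTORS = {
    "education": [1.0, 0.0],
    "school": [1.0, 0.0],
    "teacher": [0.6, 0.8],
    "football": [0.0, 1.0],
    "goal": [0.0, 1.0],
}
TOPIC_WORDS = {
    0: [("school", 0.5), ("teacher", 0.5)],
    1: [("football", 0.5), ("goal", 0.5)],
}
ESSAYS = {
    "on_topic": {0: 0.9, 1: 0.1},
    "off_topic": {0: 0.1, 1: 0.9},
}
PROMPT_WORDS = ["education"]
THRESHOLD = 0.4  # assumed cut-off; the paper tunes its own threshold

scores = {
    name: relevance(topics, TOPIC_WORDS, PROMPT_WORDS, VECTORS)
    for name, topics in ESSAYS.items()
}
# Essays scoring below the threshold are flagged as off-topic.
flagged = {name for name, score in scores.items() if score < THRESHOLD}
print(scores, flagged, f_value(flagged, {"off_topic"}))
```

In a real pipeline the toy dictionaries would be replaced by the outputs of trained topic and embedding models, and the threshold and number of topics would be chosen by comparing F values across settings, as the abstract describes.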

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d885/8880936/079c904ee5c0/pone.0264552.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d885/8880936/434fdadc537d/pone.0264552.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d885/8880936/f55f97a39434/pone.0264552.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d885/8880936/016794281d53/pone.0264552.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d885/8880936/a3e4542775a0/pone.0264552.g004.jpg

Similar articles

1
Application of LDA and word2vec to detect English off-topic composition.
PLoS One. 2022 Feb 25;17(2):e0264552. doi: 10.1371/journal.pone.0264552. eCollection 2022.
2
Retraction: Application of LDA and word2vec to detect English off-topic composition.
PLoS One. 2023 Mar 15;18(3):e0283315. doi: 10.1371/journal.pone.0283315. eCollection 2023.
3
AI-based disease category prediction model using symptoms from low-resource Ethiopian language: Afaan Oromo text.
Sci Rep. 2024 May 16;14(1):11233. doi: 10.1038/s41598-024-62278-7.
4
Correction: Application of LDA and word2vec to detect English off-topic composition.
PLoS One. 2024 Oct 21;19(10):e0312710. doi: 10.1371/journal.pone.0312710. eCollection 2024.
5
Word2Vec inversion and traditional text classifiers for phenotyping lupus.
BMC Med Inform Decis Mak. 2017 Aug 22;17(1):126. doi: 10.1186/s12911-017-0518-1.
6
Deep learning for religious and continent-based toxic content detection and classification.
Sci Rep. 2022 Oct 19;12(1):17478. doi: 10.1038/s41598-022-22523-3.
7
Improving the Polarity of Text through word2vec Embedding for Primary Classical Arabic Sentiment Analysis.
Neural Process Lett. 2023 Jan 23:1-16. doi: 10.1007/s11063-022-11111-1.
8
Facial Expression Recognition Based on LDA Feature Space Optimization.
Comput Intell Neurosci. 2022 Aug 29;2022:9521329. doi: 10.1155/2022/9521329. eCollection 2022.
9
Comparison of deep learning models for natural language processing-based classification of non-English head CT reports.
Neuroradiology. 2020 Oct;62(10):1247-1256. doi: 10.1007/s00234-020-02420-0. Epub 2020 Apr 25.
10
Deep Learning in Population Genetics.
Genome Biol Evol. 2023 Feb 3;15(2). doi: 10.1093/gbe/evad008.

Cited by

1
The data visualization and intelligent text analysis for effective evaluation of English language teaching.
Sci Rep. 2025 Jul 2;15(1):22737. doi: 10.1038/s41598-025-08182-0.
2
Correction: Application of LDA and word2vec to detect English off-topic composition.
PLoS One. 2024 Oct 21;19(10):e0312710. doi: 10.1371/journal.pone.0312710. eCollection 2024.
3
Retraction: Application of LDA and word2vec to detect English off-topic composition.
PLoS One. 2023 Mar 15;18(3):e0283315. doi: 10.1371/journal.pone.0283315. eCollection 2023.
4
Predicting the Mortality of ICU Patients by Topic Model with Machine-Learning Techniques.
Healthcare (Basel). 2022 Jun 11;10(6):1087. doi: 10.3390/healthcare10061087.