
Measurement method research of Chinese texts' difficulty based on two-characters continuations.

Affiliations

School of Humanities, Fujian University of Technology, Fuzhou, Fujian, China.

Fujian Key Laboratory of Intelligent Machining Technology and Equipment, Fujian University of Technology, Fuzhou, Fujian, China.

Publication information

PLoS One. 2024 Sep 4;19(9):e0309717. doi: 10.1371/journal.pone.0309717. eCollection 2024.

Abstract

A two-character continuation is a string of two characters that appear consecutively in linear sequence. As a unit, it can break through the encapsulation and independence of long, solidified language chunks (words and phrases), so it can measure the information carried not only by static language units (words and phrases) but also by their combinations in the text. Two-character continuations are therefore used as the measurement unit for investigating the difficulty of Chinese texts, with the aim of improving measurement accuracy. Three methods for measuring text difficulty are proposed, based respectively on the "continuation index of character", "new and stable two-characters continuation" and "emerging tendency of two-characters continuation". The results show that the method based on new and stable two-character continuations is more effective than the other two: its accuracy in measuring text difficulty at 6, 3 and 2 difficulty levels reaches 36.4%, 64.6% and 79.6%, respectively. This method is also more effective than those in Jiang and Wu's research works.
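As an illustration of the measurement unit only, the following minimal Python sketch extracts two-character continuations from a sentence and counts them. It is not the authors' implementation: the function names, the sample sentence, and the choice to treat only CJK Unified Ideographs (U+4E00 to U+9FFF) as characters, with punctuation breaking a sequence, are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumption: only CJK Unified Ideographs count as characters; punctuation,
# digits, Latin letters and whitespace break a sequence.
HAN_RUN = re.compile(r"[\u4e00-\u9fff]+")


def two_char_continuations(text):
    """Return every pair of adjacent Chinese characters, in order of appearance."""
    pairs = []
    for run in HAN_RUN.findall(text):
        # Slide a window of width 2 over each uninterrupted run of characters.
        pairs.extend(run[i:i + 2] for i in range(len(run) - 1))
    return pairs


def continuation_counts(text):
    """Count how often each two-character continuation occurs (hypothetical helper)."""
    return Counter(two_char_continuations(text))


# Hypothetical sample sentence, not taken from the paper's corpus.
sample = "我喜欢学习中文，中文很有趣。"
print(two_char_continuations(sample))
print(continuation_counts(sample).most_common(1))
```

In this sample, pairs such as "欢学" and "习中" span the boundary between two words, which is the sense in which the unit can capture how words combine and not just the words themselves.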

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/75ce/11373855/644e2b95267c/pone.0309717.g001.jpg
