School of Humanities, Fujian University of Technology, Fuzhou, Fujian, China.
Fujian Key Laboratory of Intelligent Machining Technology and Equipment, Fujian University of Technology, Fuzhou, Fujian, China.
PLoS One. 2024 Sep 4;19(9):e0309717. doi: 10.1371/journal.pone.0309717. eCollection 2024.
A two-character continuation, a string of two characters that occur in linear sequence, can cut across the boundaries of long, solidified language chunks (words and phrases), which are otherwise treated as encapsulated and independent units. It can therefore capture information not only about static language units (words and phrases) but also about how they combine in a text. For this reason, the two-character continuation is used as the measurement unit for investigating the difficulty of Chinese text, with the aim of improving measurement accuracy. Three methods for measuring text difficulty are proposed, based respectively on the "continuation index of character", "new and stable two-character continuation" and "emerging tendency of two-character continuation". The results show that the method based on new and stable two-character continuations is more effective than the other two: its accuracy reaches 36.4%, 64.6% and 79.6% when text difficulty is divided into 6, 3 and 2 levels, respectively. Compared with Jiang and Wu's research works, this method also shows better effectiveness.
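As an illustration of the measurement unit only, the following minimal Python sketch extracts every pair of adjacent characters from a Chinese string and counts them. It does not implement any of the three proposed methods, whose definitions are not given in the abstract. The function names `is_cjk` and `two_character_continuations` are hypothetical, and the rule that punctuation, digits and spaces break a sequence is an assumption made for this example.

```python
from collections import Counter


def is_cjk(ch: str) -> bool:
    """True if ch lies in the CJK Unified Ideographs block (U+4E00 to U+9FFF)."""
    return "\u4e00" <= ch <= "\u9fff"


def two_character_continuations(text: str) -> list:
    """Return every pair of adjacent Chinese characters in text, in order.

    Assumption for this sketch: any non-CJK character (punctuation, digits,
    spaces) breaks the sequence, so no pair is formed across it.
    """
    pairs = []
    for left, right in zip(text, text[1:]):
        if is_cjk(left) and is_cjk(right):
            pairs.append(left + right)
    return pairs


sample = "我喜欢学习,学习中文"
pairs = two_character_continuations(sample)
counts = Counter(pairs)

print(pairs)
print(counts.most_common(1))
```

In this sample the pair "欢学" straddles the boundary between the words "喜欢" and "学习", which is the sense in which a two-character continuation is not confined to a single word or phrase.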