School of Artificial Intelligence, Beijing Normal University, No. 19, Xinjiekouwai St., Haidian District, Beijing 100875, China.
Sensors (Basel). 2022 Jul 3;22(13):5024. doi: 10.3390/s22135024.
Traditional sentence representation methods tend to lose much of the global and contextual semantics during semantic capture, and they ignore the internal structure of the words in a sentence. To address these limitations, we propose character-assisted construction-Bert (CharAs-CBert), a sentence representation method intended to improve the accuracy of sentiment text classification. First, a construction vector is generated from the sentence's construction to distinguish the basic morphology of the sentence and to reduce the ambiguity of the same word across different sentences; it also strengthens the representation of salient words and helps capture contextual semantics. Second, character feature vectors are introduced to explore the internal structure of sentences and to improve the representation of local and global semantics. Third, to make the sentence representation more stable and robust, character information, word information, and construction vectors are combined into a single sentence representation. Finally, the method is evaluated on several open-source benchmark datasets, including ACL-14 and SemEval 2014, to demonstrate the validity and reliability of the sentence representation; on ACL-14, the two reported scores are 87.54% and 92.88%, the latter being accuracy (ACC).