Text Matching in Insurance Question-Answering Community Based on an Integrated BiLSTM-TextCNN Model Fusing Multi-Feature

Li Zhaohui, Yang Xueru, Zhou Luli, Jia Hongyu, Li Wenli

School of Maritime Economics and Management, Dalian Maritime University, Dalian 116026, China.

School of Economics and Management, Dalian University of Technology, Dalian 116024, China.

Entropy (Basel). 2023 Apr 10;25(4):639. doi: 10.3390/e25040639.

With the surge of interest in ChatGPT, artificial intelligence question-answering systems have attracted unprecedented attention. Intelligent question-answering uses machine learning to let computers imitate the way people understand a corpus, so that they can answer questions in professional fields. Obtaining more accurate answers to personalized questions in professional fields is the core problem of intelligent question-answering research. Text matching is one of its key technologies, and its accuracy bears directly on the development of intelligent question-answering communities. To address polysemy, the Enhanced Representation through Knowledge Integration (ERNIE) model is used to obtain the word vector representation of the text, which compensates for the lack of prior knowledge in traditional word vector representation models. Chinese also contains homophones and polyphones, so this paper introduces the phonetic character sequence of the text to distinguish them. In addition, because the insurance field has many proper nouns that are difficult to identify, proper nouns are given specially defined part-of-speech tags after conventional part-of-speech tagging. On top of these three text-based semantic feature extensions, this paper uses the Bi-directional Long Short-Term Memory (BiLSTM) and TextCNN models to extract the global and local features of the text, respectively, which yields a more comprehensive feature representation. The resulting text matching model, which integrates BiLSTM and TextCNN and fuses multiple features (MFBT), is proposed for the insurance question-answering community. The MFBT model targets the problems that affect answer selection in the insurance question-answering community, such as proper nouns, nonstandard sentences and sparse features.
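
The three feature extensions described above can be pictured as a per-token concatenation of a word vector, a phonetic (pinyin) vector and a part-of-speech vector. The following is a minimal sketch under stated assumptions, not the authors' implementation: the toy lookup tables, the vector sizes, the `INS_TERM` tag name and the `fuse_features` helper are all hypothetical, and the hand-written word vectors merely stand in for ERNIE output.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lookup tables. A real system would take word vectors from
# ERNIE and derive pinyin and part-of-speech tags with dedicated tools.
WORD_VECS: Dict[str, List[float]] = {
    "重疾险": [0.9, 0.1],
    "理赔": [0.2, 0.8],
    "<unk>": [0.0, 0.0],
}
PINYIN_VECS: Dict[str, List[float]] = {
    "zhong ji xian": [1.0, 0.0],
    "li pei": [0.0, 1.0],
    "<unk>": [0.0, 0.0],
}
# "INS_TERM" is an assumed name for a specially defined proper-noun tag.
POS_VECS: Dict[str, List[float]] = {
    "n": [1.0, 0.0, 0.0],
    "v": [0.0, 1.0, 0.0],
    "INS_TERM": [0.0, 0.0, 1.0],
}

Token = Tuple[str, str, str]  # (word, pinyin, part-of-speech tag)


def fuse_features(tokens: List[Token]) -> List[List[float]]:
    """Concatenate word, pinyin and POS vectors for each token."""
    fused = []
    for word, pinyin, pos in tokens:
        word_vec = WORD_VECS.get(word, WORD_VECS["<unk>"])
        pinyin_vec = PINYIN_VECS.get(pinyin, PINYIN_VECS["<unk>"])
        pos_vec = POS_VECS[pos]  # raises KeyError for tags outside the toy table
        # List concatenation yields one fused vector per token.
        fused.append(word_vec + pinyin_vec + pos_vec)
    return fused


sentence = [("重疾险", "zhong ji xian", "INS_TERM"), ("理赔", "li pei", "v")]
fused = fuse_features(sentence)
```

In the setup the abstract describes, a fused sequence of this kind would then be passed to the BiLSTM for global features and to TextCNN for local features; how the two outputs are combined is not specified in the abstract and is left out of this sketch.
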
Using question-and-answer data from the insurance library as the sample, the MFBT text-matching model is compared with other models. The experimental results show that the MFBT model achieves higher values of the evaluation metrics, including accuracy, recall and F1, than the other models. Trained on historical search data, the model can better help users of the insurance question-and-answer community find the answers they need and improve their satisfaction.
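
The evaluation metrics named above can be computed from binary match labels as follows. This is an illustrative sketch only: the `matching_metrics` helper and the label lists are hypothetical and are not taken from the paper's data, and precision is computed only because F1 depends on it.

```python
from typing import Dict, List


def matching_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for binary match labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = candidate answer matches the question, 0 = no match.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = matching_metrics(y_true, y_pred)
```

With these toy labels there are two true positives, one false positive, one false negative and two true negatives, so all four metrics come out to 2/3.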

Similar articles

Text Matching in Insurance Question-Answering Community Based on an Integrated BiLSTM-TextCNN Model Fusing Multi-Feature.

Entropy (Basel). 2023 Apr 10;25(4):639. doi: 10.3390/e25040639.

Using Semantic Text Similarity calculation for question matching in a rheumatoid arthritis question-answering system.

Quant Imaging Med Surg. 2023 Apr 1;13(4):2183-2196. doi: 10.21037/qims-22-749. Epub 2023 Mar 15.

Research on sentiment classification for netizens based on the BERT-BiLSTM-TextCNN model.

PeerJ Comput Sci. 2022 Jun 8;8:e1005. doi: 10.7717/peerj-cs.1005. eCollection 2022.

A Stacked BiLSTM Neural Network Based on Coattention Mechanism for Question Answering.

Comput Intell Neurosci. 2019 Aug 21;2019:9543490. doi: 10.1155/2019/9543490. eCollection 2019.

DeBERTa-BiLSTM: A multi-label classification model of Arabic medical questions using pre-trained models and deep learning.

Comput Biol Med. 2024 Mar;170:107921. doi: 10.1016/j.compbiomed.2024.107921. Epub 2024 Jan 4.

MAGE: Multi-scale Context-aware Interaction based on Multi-granularity Embedding for Chinese Medical Question Answer Matching.

Comput Methods Programs Biomed. 2023 Jan;228:107249. doi: 10.1016/j.cmpb.2022.107249. Epub 2022 Nov 17.

A combination of TEXTCNN model and Bayesian classifier for microblog sentiment analysis.

J Comb Optim. 2023;45(4):109. doi: 10.1007/s10878-023-01038-1. Epub 2023 May 11.

CapsTM: capsule network for Chinese medical text matching.

BMC Med Inform Decis Mak. 2021 Jul 30;21(Suppl 2):94. doi: 10.1186/s12911-021-01442-9.

Biomedical named entity recognition based on fusion multi-features embedding.

Technol Health Care. 2023;31(S1):111-121. doi: 10.3233/THC-236011.

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Cited by

FinKENet: A Novel Financial Knowledge Enhanced Network for Financial Question Matching.

Entropy (Basel). 2023 Dec 26;26(1):26. doi: 10.3390/e26010026.

References

Double attention recurrent convolution neural network for answer selection.

R Soc Open Sci. 2020 May 20;7(5):191517. doi: 10.1098/rsos.191517. eCollection 2020 May.

Framewise phoneme classification with bidirectional LSTM and other neural network architectures.

Neural Netw. 2005 Jun-Jul;18(5-6):602-10. doi: 10.1016/j.neunet.2005.06.042.

Long short-term memory.

Neural Comput. 1997 Nov 15;9(8):1735-80. doi: 10.1162/neco.1997.9.8.1735.
