Improved XLNet modeling for Chinese named entity recognition of edible fungus.

Authors

Yu Helong, Wang Chenxi, Xue Mingxuan

Affiliation

College of Information Technology, Jilin Agricultural University, Changchun, China.

Publication details

Front Plant Sci. 2024 Jun 25;15:1368847. doi: 10.3389/fpls.2024.1368847. eCollection 2024.

DOI: 10.3389/fpls.2024.1368847
PMID: 38984153
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11232502/
Abstract

INTRODUCTION

The diversity of edible fungus species and the breadth of mycological knowledge pose significant challenges to the research, cultivation, and popularization of edible fungi. Meeting these challenges requires a rapid and accurate way to acquire relevant information, and question-answering (Q&A) systems have the potential to provide one. Named entity recognition (NER) is the basis for building an intelligent Q&A system for edible fungi. However, the field lacks a publicly available Chinese corpus suitable for NER, and conventional methods struggle to capture long-distance dependencies.

METHODS

This paper describes the construction of a Chinese corpus for the edible fungus domain and introduces an NER method for edible fungus information that combines XLNet, an iterated dilated convolutional neural network (IDCNN), and a conditional random field (CRF). With XLNet as the foundation, an IDCNN layer is added; by widening the receptive field of the convolutional kernels, it addresses the limited capacity to capture features across utterances. The output of the IDCNN layer is fed to the CRF layer, which mitigates logically inconsistent label sequences and yields the globally optimal labeling for the edible fungus NER task.
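The two ideas in this paragraph, a receptive field widened by dilation and a CRF layer that rules out illogical label sequences, can be illustrated with a small pure-Python sketch. It is not the authors' implementation. The kernel size of 3, the dilation pattern [1, 1, 2], the `FUNGUS` entity type, the BIO tag scheme, the hard transition constraint, and the emission scores are all assumptions chosen for illustration; in a trained CRF the transition scores are learned rather than fixed.

```python
from typing import Dict, List

NEG_INF = float("-inf")


def receptive_field(kernel_size: int, dilations: List[int]) -> int:
    """Tokens visible to one output position after stacking dilated 1-D convolutions."""
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


def transition_score(prev: str, curr: str) -> float:
    """Hypothetical hard constraint: I-X may only follow B-X or I-X."""
    if curr.startswith("I-"):
        entity = curr[2:]
        if prev not in (f"B-{entity}", f"I-{entity}"):
            return NEG_INF
    return 0.0


def viterbi(emissions: List[Dict[str, float]], tags: List[str]) -> List[str]:
    """Return the tag sequence with the highest total emission + transition score."""
    # Each column maps tag -> (best score of a path ending in tag, previous tag).
    # The sentence start is treated like a preceding "O" tag.
    best = [{t: (transition_score("O", t) + emissions[0][t], None) for t in tags}]
    for step in emissions[1:]:
        column = {}
        for curr in tags:
            column[curr] = max(
                (best[-1][p][0] + transition_score(p, curr) + step[curr], p)
                for p in tags
            )
        best.append(column)
    # Backtrack from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


tags = ["O", "B-FUNGUS", "I-FUNGUS"]
# Made-up per-token scores standing in for the IDCNN layer's output.
emissions = [
    {"O": 0.5, "B-FUNGUS": 0.4, "I-FUNGUS": 0.1},
    {"O": 0.1, "B-FUNGUS": 0.2, "I-FUNGUS": 0.7},
    {"O": 0.9, "B-FUNGUS": 0.05, "I-FUNGUS": 0.05},
]

greedy = [max(tags, key=lambda t: step[t]) for step in emissions]
print(greedy)                       # ['O', 'I-FUNGUS', 'O'] (I- without a B-)
print(viterbi(emissions, tags))     # ['B-FUNGUS', 'I-FUNGUS', 'O']
print(receptive_field(3, [1, 1, 2]))      # 9
print(receptive_field(3, [1, 1, 2] * 4))  # 33
```

Picking the best tag per token independently produces an I- tag with no preceding B- tag, while decoding over the whole sequence returns the best legal labeling. The last two lines show how three dilated layers already cover 9 tokens and four repetitions of that block cover 33, without any pooling.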

RESULTS

Experimental results show that the proposed model achieves a precision of 0.971, a recall of 0.986, and an F1-score of 0.979.
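As a quick reference for how these three metrics relate, the sketch below computes F1 as the harmonic mean of precision and recall. The function name `f1_score` is a hypothetical helper, not something from the paper. Applied to the rounded figures above it gives about 0.978, slightly below the reported 0.979; a gap of this size is within what rounding of the inputs could explain, and the abstract does not say whether the scores are averaged over entity types.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values as reported in the abstract.
f1 = f1_score(0.971, 0.986)
print(round(f1, 3))  # 0.978
```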

DISCUSSION

On these evaluation metrics, the proposed model outperforms existing approaches. It effectively recognizes entities in edible fungus information and offers methodological support for the construction of knowledge graphs.

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5045/11232502/2f7e3ffe190b/fpls-15-1368847-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5045/11232502/ef96374f0238/fpls-15-1368847-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5045/11232502/b9b2974829a6/fpls-15-1368847-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5045/11232502/97c951497831/fpls-15-1368847-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5045/11232502/092adf6b03e5/fpls-15-1368847-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5045/11232502/d62b42a02847/fpls-15-1368847-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5045/11232502/b68eb42ac88a/fpls-15-1368847-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5045/11232502/4c00784594af/fpls-15-1368847-g008.jpg
