Improving part-of-speech tagging in Amharic language using deep neural network.

Authors

Hirpassa Sintayehu, Lehal G S

Affiliations

Department of Computer Science, Adama Science and Technology University, Ethiopia.

Department of Computer Science, Punjabi University, India.

Publication information

Heliyon. 2023 Jun 21;9(7):e17175. doi: 10.1016/j.heliyon.2023.e17175. eCollection 2023 Jul.

DOI:10.1016/j.heliyon.2023.e17175
PMID:37539248
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10394909/
Abstract

To date, several part-of-speech (POS) taggers have been introduced to support semantic analysis in different languages. However, POS tagging is more intricate in morphologically complex languages such as Amharic. In this paper, we evaluated several models for Amharic POS tagging: bidirectional long short-term memory, a convolutional neural network combined with bidirectional long short-term memory, and a conditional random field. Various features, both language-dependent and language-independent, were explored in the conditional random field model. In addition, word-level and character-level features were analyzed in the deep neural network models, where a convolutional neural network encodes features at the word and character level. Each model's performance was evaluated on a dataset of 321 K tokens manually tagged with 31 POS tags. The best performance, 97.23% accuracy, was obtained by an end-to-end deep neural network model that combines a convolutional neural network, bidirectional long short-term memory, and a conditional random field. This is the highest accuracy reported for the Amharic POS tagging task and is competitive with contemporary taggers for other languages.
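The abstract's best model ends in a conditional random field, which scores whole tag sequences rather than choosing each tag independently. The sketch below is not the authors' implementation; it only illustrates, under stated assumptions, how Viterbi decoding picks the best sequence from per-token scores (such as a BiLSTM might emit) plus tag-to-tag transition scores. The function name `viterbi_decode`, the two-tag set `N`/`V`, and all numeric scores are hypothetical.

```python
from typing import Dict, List, Sequence


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: Sequence[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain model.

    emissions[t][tag]  : score of `tag` at position t (assumed encoder output).
    transitions[a][b]  : score of moving from tag a to tag b (CRF-style).
    """
    if not emissions:
        return []

    # best[t][tag] = best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t-1][tag] = previous tag on that best path

    for t in range(1, len(emissions)):
        scores = {}
        pointers = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Start from the best final tag and follow the back-pointers.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy data (hypothetical): three tokens, two tags.
TAGS = ["N", "V"]
EMISSIONS = [
    {"N": 2.0, "V": 0.5},
    {"N": 1.2, "V": 1.0},
    {"N": 1.5, "V": 0.2},
]
# Transitions that discourage repeating the same tag.
TRANSITIONS = {
    "N": {"N": -1.0, "V": 1.0},
    "V": {"N": 1.0, "V": -2.0},
}

greedy = [max(TAGS, key=lambda tag: e[tag]) for e in EMISSIONS]
decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
print(greedy)   # per-token argmax, ignores transitions
print(decoded)  # sequence-level decoding
```

With these toy scores, the per-token argmax yields `N N N`, while sequence-level decoding yields `N V N` because the transition scores penalize repeated tags. This is the kind of interaction a CRF layer adds on top of token-level scores.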


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/67f3/10394909/e2b7beaac44b/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/67f3/10394909/2d7dea9d8bc1/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/67f3/10394909/1ca5070b4d76/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/67f3/10394909/c107970da940/gr4.jpg

Similar articles

1. Improving part-of-speech tagging in Amharic language using deep neural network.
   Heliyon. 2023 Jun 21;9(7):e17175. doi: 10.1016/j.heliyon.2023.e17175. eCollection 2023 Jul.
2. Deep learning-based idiomatic expression recognition for the Amharic language.
   PLoS One. 2023 Dec 14;18(12):e0295339. doi: 10.1371/journal.pone.0295339. eCollection 2023.
3. A deep learning model incorporating part of speech and self-matching attention for named entity recognition of Chinese electronic medical records.
   BMC Med Inform Decis Mak. 2019 Apr 9;19(Suppl 2):65. doi: 10.1186/s12911-019-0762-7.
4. A fine-grained Chinese word segmentation and part-of-speech tagging corpus for clinical text.
   BMC Med Inform Decis Mak. 2019 Apr 9;19(Suppl 2):66. doi: 10.1186/s12911-019-0770-7.
5. Part-of-Speech tagging enhancement to natural language processing for Thai wh-question classification with deep learning.
   Heliyon. 2021 Oct 19;7(10):e08216. doi: 10.1016/j.heliyon.2021.e08216. eCollection 2021 Oct.
6. Chinese Clinical Named Entity Recognition From Electronic Medical Records Based on Multisemantic Features by Using Robustly Optimized Bidirectional Encoder Representation From Transformers Pretraining Approach Whole Word Masking and Convolutional Neural Networks: Model Development and Validation.
   JMIR Med Inform. 2023 May 10;11:e44597. doi: 10.2196/44597.
7. Towards audio-based identification of Ethio-Semitic languages using recurrent neural network.
   Sci Rep. 2023 Nov 7;13(1):19346. doi: 10.1038/s41598-023-46646-3.
8. A Data-Driven Model for Automated Chinese Word Segmentation and POS Tagging.
   Comput Intell Neurosci. 2022 Sep 16;2022:7622392. doi: 10.1155/2022/7622392. eCollection 2022.
9. Multilingual part-of-speech tagging with weightless neural networks.
   Neural Netw. 2015 Jun;66:11-21. doi: 10.1016/j.neunet.2015.02.012. Epub 2015 Mar 2.
10. Aspect extraction on user textual reviews using multi-channel convolutional neural network.
    PeerJ Comput Sci. 2019 May 6;5:e191. doi: 10.7717/peerj-cs.191. eCollection 2019.

References cited in this article

1. From POS tagging to dependency parsing for biomedical event extraction.
   BMC Bioinformatics. 2019 Feb 12;20(1):72. doi: 10.1186/s12859-019-2604-0.