What does Chinese BERT learn about syntactic knowledge?

Author information

Zheng Jianyu, Liu Ying

Affiliation

Department of Chinese Language and Literature, Tsinghua University, Haidian District, Beijing, China.

Publication information

PeerJ Comput Sci. 2023 Jul 26;9:e1478. doi: 10.7717/peerj-cs.1478. eCollection 2023.

DOI: 10.7717/peerj-cs.1478
PMID: 37547407
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10403162/
Abstract

Pre-trained language models such as Bidirectional Encoder Representations from Transformers (BERT) have been applied to a wide range of natural language processing (NLP) tasks and obtained significantly positive results. A growing body of research has investigated why BERT is so effective and what linguistic knowledge it is able to learn. However, most of these works have focused almost exclusively on English. Few studies have explored the linguistic information, particularly syntactic information, that BERT learns in Chinese, which is written as sequences of characters. In this study, we adopted several probing methods to identify the syntactic knowledge stored in the attention heads and hidden states of Chinese BERT. The results suggest that some individual heads and combinations of heads do well in encoding corresponding and overall syntactic relations, respectively. The hidden representation of each layer also contains syntactic information to different degrees. We also analyzed Chinese BERT models fine-tuned for different tasks, covering all linguistic levels. Our results suggest that these fine-tuned models reflect changes in how language structure is preserved. These findings help explain why Chinese BERT can show such large improvements across many language-processing tasks.
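As a concrete illustration of the attention-head probing the abstract describes, the sketch below (not the authors' code) loads a Chinese BERT with the Hugging Face transformers library, extracts per-head attention maps, and checks how often a dependent character attends most strongly to its syntactic head. The checkpoint name bert-base-chinese, the toy sentence, and the single hand-labelled dependency arc are illustrative assumptions; a real evaluation would score one relation type per probe against a dependency treebank.

```python
import torch
from transformers import BertModel, BertTokenizer

# Assumed checkpoint: the widely used bert-base-chinese (character-level tokens).
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese", output_attentions=True)
model.eval()

sentence = "我喜欢自然语言处理"  # "I like natural language processing"
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one tensor per layer, each (batch, heads, seq_len, seq_len);
# bert-base-chinese has 12 layers x 12 heads.
attentions = outputs.attentions

# Toy gold dependency arc over 1-indexed character positions (head, dependent):
# here the verb character at position 2 is taken to govern the subject "我" at position 1.
gold_arcs = [(2, 1)]

def head_accuracy(layer_attn, head_idx, arcs):
    """Fraction of gold arcs where the dependent's strongest attention
    (ignoring [CLS]/[SEP]) lands on its syntactic head."""
    attn = layer_attn[0, head_idx][1:-1, 1:-1]  # strip special-token rows/columns
    hits = 0
    for head_pos, dep_pos in arcs:
        predicted = attn[dep_pos - 1].argmax().item() + 1
        hits += int(predicted == head_pos)
    return hits / len(arcs)

# Score every (layer, head) pair; heads scoring high for a relation correspond to
# the "individual heads" that the abstract says encode specific syntactic relations.
scores = [
    (layer, head, head_accuracy(layer_attn, head, gold_arcs))
    for layer, layer_attn in enumerate(attentions)
    for head in range(layer_attn.shape[1])
]
best = max(scores, key=lambda s: s[2])
print(f"best head: layer {best[0]}, head {best[1]}, accuracy {best[2]:.2f}")
```

Probing the hidden states mentioned in the abstract follows the same pattern: request output_hidden_states=True, take each layer's hidden representations, and fit a lightweight classifier on them, so that layer-wise accuracy indicates how much syntactic information each layer carries.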


Figures 1–10 of the article are available via the PMC page: https://pmc.ncbi.nlm.nih.gov/articles/PMC10403162/

Similar articles

1. What does Chinese BERT learn about syntactic knowledge?
PeerJ Comput Sci. 2023 Jul 26;9:e1478. doi: 10.7717/peerj-cs.1478. eCollection 2023.
2. Extracting comprehensive clinical information for breast cancer using deep learning methods.
Int J Med Inform. 2019 Dec;132:103985. doi: 10.1016/j.ijmedinf.2019.103985. Epub 2019 Oct 2.
3. Multi-Label Classification in Patient-Doctor Dialogues With the RoBERTa-WWM-ext + CNN (Robustly Optimized Bidirectional Encoder Representations From Transformers Pretraining Approach With Whole Word Masking Extended Combining a Convolutional Neural Network) Model: Named Entity Study.
JMIR Med Inform. 2022 Apr 21;10(4):e35606. doi: 10.2196/35606.
4. Use of BERT (Bidirectional Encoder Representations from Transformers)-Based Deep Learning Method for Extracting Evidences in Chinese Radiology Reports: Development of a Computer-Aided Liver Cancer Diagnosis Framework.
J Med Internet Res. 2021 Jan 12;23(1):e19689. doi: 10.2196/19689.
5. Fine-Tuning Bidirectional Encoder Representations From Transformers (BERT)-Based Models on Large-Scale Electronic Health Record Notes: An Empirical Study.
JMIR Med Inform. 2019 Sep 12;7(3):e14830. doi: 10.2196/14830.
6. GT-Finder: Classify the family of glucose transporters with pre-trained BERT language models.
Comput Biol Med. 2021 Apr;131:104259. doi: 10.1016/j.compbiomed.2021.104259. Epub 2021 Feb 7.
7. When BERT meets Bilbo: a learning curve analysis of pretrained language model on disease classification.
BMC Med Inform Decis Mak. 2022 Apr 5;21(Suppl 9):377. doi: 10.1186/s12911-022-01829-2.
8. BioBERT and Similar Approaches for Relation Extraction.
Methods Mol Biol. 2022;2496:221-235. doi: 10.1007/978-1-0716-2305-3_12.
9. Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical-drug relation extraction?
Database (Oxford). 2022 Aug 25;2022. doi: 10.1093/database/baac070.
10. Fine-tuning BERT for automatic ADME semantic labeling in FDA drug labeling to enhance product-specific guidance assessment.
J Biomed Inform. 2023 Feb;138:104285. doi: 10.1016/j.jbi.2023.104285. Epub 2023 Jan 9.
