Do neural nets learn statistical laws behind natural language?

Authors

Takahashi Shuntaro, Tanaka-Ishii Kumiko

Affiliations

The University of Tokyo, Graduate School of Frontier Sciences, Chiba 277-8563, Japan.

The University of Tokyo, Research Center for Advanced Science and Technology, Tokyo 153-8904, Japan.

Publication

PLoS One. 2017 Dec 29;12(12):e0189326. doi: 10.1371/journal.pone.0189326. eCollection 2017.

DOI: 10.1371/journal.pone.0189326
PMID: 29287076
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5747447/
Abstract

The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of its effectiveness and of a limitation of neural networks for language engineering. Precisely, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf's law and Heaps' law, two representative statistical properties underlying natural language. We discuss the quality of reproducibility and the emergence of Zipf's law and Heaps' law as training progresses. We also point out that the neural language model has a limitation in reproducing long-range correlation, another statistical property of natural language. This understanding could provide a direction for improving the architectures of neural networks.
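The two laws named in the abstract have simple operational forms: Zipf's law says the r-th most frequent word type occurs with frequency roughly proportional to r^(-alpha) with alpha near 1, and Heaps' law says the number of distinct word types grows sublinearly, as n^beta with beta < 1, in the number of tokens n. As an illustrative sketch only (not code from the paper), the following draws a synthetic corpus from an ideal Zipf distribution and estimates both exponents; on real or LSTM-generated text one would tokenize first, but the measurements are the same. Corpus size and vocabulary size here are arbitrary choices.

```python
import math
import random
from collections import Counter

random.seed(0)

# Synthetic corpus: sample 200,000 tokens from an ideal Zipf
# distribution over a 10,000-word vocabulary (weight 1/r for rank r).
V = 10_000
weights = [1.0 / r for r in range(1, V + 1)]
corpus = random.choices(range(V), weights=weights, k=200_000)

# Zipf's law: frequency of the r-th most common type ~ r^(-alpha).
# Estimate alpha by least squares on log rank vs. log frequency.
freqs = sorted(Counter(corpus).values(), reverse=True)
xs = [math.log(r) for r in range(1, len(freqs) + 1)]
ys = [math.log(f) for f in freqs]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
alpha = -sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
    (x - mx) ** 2 for x in xs)

# Heaps' law: vocabulary size V(n) ~ n^beta. Record vocabulary growth
# every 10,000 tokens and estimate beta from the endpoints in log-log.
seen, growth = set(), []
for i, w in enumerate(corpus, 1):
    seen.add(w)
    if i % 10_000 == 0:
        growth.append((i, len(seen)))
beta = (math.log(growth[-1][1]) - math.log(growth[0][1])) / (
    math.log(growth[-1][0]) - math.log(growth[0][0]))

print(f"Zipf exponent alpha ≈ {alpha:.2f}")  # close to 1 for Zipfian text
print(f"Heaps exponent beta ≈ {beta:.2f}")   # sublinear growth: beta < 1
```

The same two measurements, applied to text sampled from a trained LSTM language model instead of a synthetic Zipf source, are what the paper uses to judge whether the model reproduces the laws.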


Figures (g001–g007):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7921/5747447/07563e5afdc8/pone.0189326.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7921/5747447/3dd47afaab06/pone.0189326.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7921/5747447/1fe11132338f/pone.0189326.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7921/5747447/46e8fb271e4d/pone.0189326.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7921/5747447/4dec77d0cbf4/pone.0189326.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7921/5747447/982abd296fca/pone.0189326.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7921/5747447/c85ce6893e10/pone.0189326.g007.jpg

Similar articles

1. Do neural nets learn statistical laws behind natural language?
PLoS One. 2017 Dec 29;12(12):e0189326. doi: 10.1371/journal.pone.0189326. eCollection 2017.
2. Zipf's law leads to Heaps' law: analyzing their relation in finite-size systems.
PLoS One. 2010 Dec 2;5(12):e14139. doi: 10.1371/journal.pone.0014139.
3. Scaling Laws for Phonotactic Complexity in Spoken English Language Data.
Lang Speech. 2021 Sep;64(3):693-704. doi: 10.1177/0023830920944445. Epub 2020 Aug 1.
4. Deviation of Zipf's and Heaps' Laws in human languages with limited dictionary sizes.
Sci Rep. 2013;3:1082. doi: 10.1038/srep01082. Epub 2013 Jan 30.
5. Evolution of scaling emergence in large-scale spatial epidemic spreading.
PLoS One. 2011;6(7):e21197. doi: 10.1371/journal.pone.0021197. Epub 2011 Jul 1.
6. Zipf's, Heaps' and Taylor's Laws are Determined by the Expansion into the Adjacent Possible.
Entropy (Basel). 2018 Sep 30;20(10):752. doi: 10.3390/e20100752.
7. Scaling laws in natural conversations among elderly people.
PLoS One. 2021 Feb 19;16(2):e0246884. doi: 10.1371/journal.pone.0246884. eCollection 2021.
8. Log-Log Convexity of Type-Token Growth in Zipf's Systems.
Phys Rev Lett. 2015 Jun 12;114(23):238701. doi: 10.1103/PhysRevLett.114.238701. Epub 2015 Jun 9.
9. Range-limited Heaps' law for functional DNA words in the human genome.
J Theor Biol. 2024 Sep 7;592:111878. doi: 10.1016/j.jtbi.2024.111878. Epub 2024 Jun 18.
10. Zipf's word frequency law in natural language: a critical review and future directions.
Psychon Bull Rev. 2014 Oct;21(5):1112-30. doi: 10.3758/s13423-014-0585-6.

Cited by

1. Long-Range Dependence in Word Time Series: The Cosine Correlation of Embeddings.
Entropy (Basel). 2025 Jun 9;27(6):613. doi: 10.3390/e27060613.

References

1. Long-Range Memory in Literary Texts: On the Universal Clustering of the Rare Words.
PLoS One. 2016 Nov 28;11(11):e0164658. doi: 10.1371/journal.pone.0164658. eCollection 2016.
2. On the complexity of neural network classifiers: a comparison between shallow and deep architectures.
IEEE Trans Neural Netw Learn Syst. 2014 Aug;25(8):1553-65. doi: 10.1109/TNNLS.2013.2293637.
3. Quantifying the information in the long-range order of words: semantic structures and universal linguistic constraints.
Cortex. 2014 Jun;55:5-16. doi: 10.1016/j.cortex.2013.08.008. Epub 2013 Aug 29.
4. On the origin of long-range correlations in texts.
Proc Natl Acad Sci U S A. 2012 Jul 17;109(29):11582-7. doi: 10.1073/pnas.1117723109. Epub 2012 Jul 2.
5. Zipf's law leads to Heaps' law: analyzing their relation in finite-size systems.
PLoS One. 2010 Dec 2;5(12):e14139. doi: 10.1371/journal.pone.0014139.
6. Beyond word frequency: bursts, lulls, and scaling in the temporal distributions of words.
PLoS One. 2009 Nov 11;4(11):e7678. doi: 10.1371/journal.pone.0007678.
7. Effect of nonlinear correlations on the statistics of return intervals in multifractal data sets.
Phys Rev Lett. 2007 Dec 14;99(24):240601. doi: 10.1103/PhysRevLett.99.240601. Epub 2007 Dec 10.
8. Renormalization-group transformations and correlations of seismicity.
Phys Rev Lett. 2005 Jul 8;95(2):028501. doi: 10.1103/PhysRevLett.95.028501. Epub 2005 Jul 7.
9. Scaling and memory in volatility return intervals in financial markets.
Proc Natl Acad Sci U S A. 2005 Jun 28;102(26):9424-8. doi: 10.1073/pnas.0502613102. Epub 2005 Jun 24.
10. Long-term memory: a natural mechanism for the clustering of extreme events and anomalous residual times in climate records.
Phys Rev Lett. 2005 Feb 4;94(4):048701. doi: 10.1103/PhysRevLett.94.048701. Epub 2005 Jan 31.