Finding scientific topics.

Author information

Griffiths Thomas L, Steyvers Mark

Affiliation

Department of Psychology, Stanford University, Stanford, CA 94305, USA.

Publication information

Proc Natl Acad Sci U S A. 2004 Apr 6;101 Suppl 1(Suppl 1):5228-35. doi: 10.1073/pnas.0307752101. Epub 2004 Feb 10.

Abstract

A first step in identifying the content of a document is determining which topics that document addresses. We describe a generative model for documents, introduced by Blei, Ng, and Jordan [Blei, D. M., Ng, A. Y. & Jordan, M. I. (2003) J. Machine Learn. Res. 3, 993-1022], in which each document is generated by choosing a distribution over topics and then choosing each word in the document from a topic selected according to this distribution. We then present a Markov chain Monte Carlo algorithm for inference in this model. We use this algorithm to analyze abstracts from PNAS by using Bayesian model selection to establish the number of topics. We show that the extracted topics capture meaningful structure in the data, consistent with the class designations provided by the authors of the articles, and outline further applications of this analysis, including identifying "hot topics" by examining temporal dynamics and tagging abstracts to illustrate semantic content.
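The generative process described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the symmetric Dirichlet priors follow the usual formulation of the Blei, Ng, and Jordan model, while the function names, the hyperparameter values (`alpha`, `beta`), and the corpus sizes are assumptions chosen for the example.

```python
import random

def sample_dirichlet(concentration, k, rng):
    """Draw from a symmetric Dirichlet of dimension k via normalized gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

def sample_index(probs, rng):
    """Draw an index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_corpus(n_docs, doc_len, n_topics, vocab_size, alpha, beta, seed=0):
    """Generate (topic, word) pairs for each document; all arguments are illustrative."""
    rng = random.Random(seed)
    # Each topic is a probability distribution over the vocabulary.
    topics = [sample_dirichlet(beta, vocab_size, rng) for _ in range(n_topics)]
    corpus = []
    for _ in range(n_docs):
        # Step 1: choose this document's distribution over topics.
        theta = sample_dirichlet(alpha, n_topics, rng)
        doc = []
        for _ in range(doc_len):
            # Step 2: select a topic from theta, then a word from that topic.
            z = sample_index(theta, rng)
            w = sample_index(topics[z], rng)
            doc.append((z, w))
        corpus.append(doc)
    return topics, corpus

# Hypothetical sizes: 5 documents of 20 words, 3 topics, 10-word vocabulary.
topics, corpus = generate_corpus(
    n_docs=5, doc_len=20, n_topics=3, vocab_size=10, alpha=1.0, beta=0.5
)
```

Inference runs this process in reverse: given only the words, the Markov chain Monte Carlo algorithm estimates which topic produced each word, and from those assignments the per-document topic distributions and per-topic word distributions.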
