PDC - a probabilistic distributional clustering algorithm: a case study on suicide articles in PubMed.

Authors

Islamaj Rezarta, Yeganova Lana, Kim Won, Xie Natalie, Wilbur W John, Lu Zhiyong

Affiliation

National Library of Medicine, National Institutes of Health, Bethesda MD, USA.

Publication

AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:259-268. eCollection 2020.

PMID: 32477645
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7233029/

Abstract

The need to organize a large collection in a manner that facilitates human comprehension is crucial given the ever-increasing volumes of information. In this work, we present PDC (probabilistic distributional clustering), a novel algorithm that, given a document collection, computes disjoint term sets representing topics in the collection. The algorithm relies on probabilities of word co-occurrences to partition the set of terms appearing in the collection of documents into disjoint groups of related terms. In this work, we also present an environment to visualize the computed topics in the term space and retrieve the most related PubMed articles for each group of terms. We illustrate the algorithm by applying it to PubMed documents on the topic of suicide. Suicide is a major public health problem identified as the tenth leading cause of death in the US. In this application, our goal is to provide a global view of the mental health literature pertaining to the subject of suicide, and through this, to help create a rich environment of multifaceted data to guide health care researchers in their endeavor to better understand the breadth, depth and scope of the problem. We demonstrate the usefulness of the proposed algorithm by providing a web portal that allows mental health researchers to peruse the suicide-related literature in PubMed.
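
The abstract describes partitioning a collection's terms into disjoint groups using word co-occurrence probabilities, but it does not give the PDC procedure itself. The sketch below is therefore not PDC. It is a minimal stand-in, under stated assumptions, that shows the general idea: estimate document-level co-occurrence probabilities, link term pairs whose pointwise mutual information (PMI) exceeds a threshold, and take the connected components as disjoint term groups. The function name `cooccurrence_groups`, the `min_pmi` parameter, the whitespace tokenization, and the use of PMI with union-find are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log


def cooccurrence_groups(docs, min_pmi=0.0):
    """Partition terms into disjoint groups by linking high-PMI pairs.

    Illustrative only: this is NOT the PDC algorithm from the paper.
    docs is a list of strings; each string is treated as one document.
    """
    n = len(docs)
    # Assumption: naive lowercase whitespace tokenization, one term set per document.
    term_sets = [set(d.lower().split()) for d in docs]

    # Document frequency of each term and of each unordered term pair.
    df = Counter(t for s in term_sets for t in s)
    pair_df = Counter(
        pair for s in term_sets for pair in combinations(sorted(s), 2)
    )

    # Union-find over terms, so every term ends up in exactly one group.
    parent = {t: t for t in df}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for (a, b), count in pair_df.items():
        # PMI = log( P(a, b) / (P(a) * P(b)) ), probabilities estimated per document.
        pmi = log((count / n) / ((df[a] / n) * (df[b] / n)))
        if pmi > min_pmi:
            parent[find(a)] = find(b)

    groups = {}
    for t in df:
        groups.setdefault(find(t), set()).add(t)
    # Sort terms within groups and groups by first term for a stable result.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


if __name__ == "__main__":
    toy_docs = [
        "suicide prevention",
        "suicide prevention",
        "gene expression",
        "gene expression",
    ]
    print(cooccurrence_groups(toy_docs))
```

On the toy input, terms that always appear together end up in the same group, and every term belongs to exactly one group. Connected components were chosen here only because they guarantee disjointness with very little code; a real system would need proper tokenization, stop-word handling, and a more careful grouping criterion.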
