Predicting protein-protein relationships from literature using latent topics.

Authors

Aso Tatsuya, Eguchi Koji

Affiliation

Department of Computer Science and Systems Engineering, Kobe University, 1-1 Rokkoudai, Nada-ku, Kobe 657-8501, Japan.

Publication

Genome Inform. 2009 Oct;23(1):3-12.

Abstract

This paper investigates the application of statistical topic models to extracting and predicting relationships between biological entities, especially protein mentions. One such model, Latent Dirichlet Allocation (LDA), is promising; however, it has not been investigated for this task. We estimate the LDA model using the state-of-the-art Collapsed Variational Bayesian inference and Gibbs sampling inference. We also apply probabilistic Latent Semantic Analysis (pLSA) as a baseline and compare the methods in terms of log-likelihood, classification accuracy, and retrieval effectiveness. Our experiments show that Collapsed Variational LDA gives better results than the others, especially in classification accuracy and retrieval effectiveness on the protein-protein relationship prediction task.
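
As an illustration of one of the inference methods named in the abstract, the following is a minimal sketch of collapsed Gibbs sampling for LDA. It is not the authors' implementation: the abstract does not say how protein mentions are turned into documents, so the toy corpus, the function name `lda_gibbs`, and the hyperparameter values `alpha` and `beta` are all assumptions made for illustration. Documents are lists of integer word ids.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01,
              n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs: list of documents, each a list of word ids in [0, vocab_size).
    Returns (theta, phi): per-document topic distributions and
    per-topic word distributions, estimated from the final sample.
    """
    rng = random.Random(seed)
    n_dk = [[0] * n_topics for _ in docs]               # doc-topic counts
    n_kw = [[0] * vocab_size for _ in range(n_topics)]  # topic-word counts
    n_k = [0] * n_topics                                # tokens per topic

    # Randomly initialise a topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Full conditional p(z = t | all other assignments),
                # up to a normalising constant.
                weights = [
                    (n_dk[d][t] + alpha)
                    * (n_kw[t][w] + beta) / (n_k[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Record the new assignment.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    theta = [
        [(n_dk[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(n_kw[t][w] + beta) / (n_k[t] + vocab_size * beta)
         for w in range(vocab_size)]
        for t in range(n_topics)
    ]
    return theta, phi


# Hypothetical toy corpus: word ids 0-2 and 3-5 tend to co-occur.
toy_docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4], [0, 2, 1, 2], [5, 3, 4, 5]]
theta, phi = lda_gibbs(toy_docs, n_topics=2, vocab_size=6)
```

Each row of `theta` is a document's estimated topic mixture and each row of `phi` is a topic's word distribution. Collapsed Variational Bayesian inference, the other method named in the abstract, replaces the sampled assignments with deterministic updates and is not shown here.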

