Predicting protein-protein relationships from literature using latent topics.

Authors

Aso Tatsuya, Eguchi Koji

Affiliation

Department of Computer Science and Systems Engineering, Kobe University, 1-1 Rokkoudai, Nada-ku, Kobe 657-8501, Japan.

Publication information

Genome Inform. 2009 Oct;23(1):3-12.

PMID:20180257

Abstract

This paper investigates applying statistical topic models to extract and predict relationships between biological entities, especially protein mentions. A statistical topic model, Latent Dirichlet Allocation (LDA), is promising; however, it has not been investigated for such a task. In this paper, we apply state-of-the-art Collapsed Variational Bayesian inference and Gibbs sampling inference to estimate the LDA model. We also apply probabilistic Latent Semantic Analysis (pLSA) as a baseline, and compare the methods in terms of log-likelihood, classification accuracy, and retrieval effectiveness. We demonstrate through experiments that Collapsed Variational LDA gives better results than the others, especially in classification accuracy and retrieval effectiveness on the protein-protein relationship prediction task.
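The abstract names Gibbs sampling as one of the inference methods used to estimate LDA. As a rough illustration only, the following is a minimal sketch of generic collapsed Gibbs sampling for LDA in pure Python. It is not the authors' implementation: the function name `lda_collapsed_gibbs`, the hyperparameter defaults (`alpha`, `beta`), the iteration count, and the representation of documents as lists of integer word ids are all assumptions made for this example.

```python
import random


def lda_collapsed_gibbs(docs, num_topics, vocab_size,
                        alpha=0.1, beta=0.01, iters=200, seed=0):
    """Generic collapsed Gibbs sampler for LDA (illustrative sketch).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (theta, topic_word): per-document topic proportions and the
    topic-by-word count matrix. All defaults are hypothetical.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # n_{d,k}
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # n_{k,w}
    topic_total = [0] * num_topics                               # n_k
    assignments = []

    # Random initialization of the topic assignment of every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Unnormalized conditional p(z = t | all other assignments).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Add the token back under its newly sampled topic.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Posterior mean estimate of each document's topic proportions.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return theta, topic_word


if __name__ == "__main__":
    # Toy corpus of word ids, invented for this example.
    toy_docs = [[0, 1, 0, 1, 2], [3, 4, 3, 4], [0, 1, 1, 0]]
    theta, _ = lda_collapsed_gibbs(toy_docs, num_topics=2, vocab_size=5, iters=50)
    for row in theta:
        print([round(p, 3) for p in row])
```

How documents and vocabulary would be built from protein mentions in the literature is not specified in the abstract, so the toy corpus here is purely a placeholder. The collapsed variational variant that the abstract reports as performing best replaces the sampling step with deterministic updates and is not shown.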

Similar articles

Predicting protein-protein relationships from literature using latent topics.

Genome Inform. 2009 Oct;23(1):3-12.

Latent-space variational bayes.

IEEE Trans Pattern Anal Mach Intell. 2008 Dec;30(12):2236-42. doi: 10.1109/TPAMI.2008.157.

Learning topic models by belief propagation.

IEEE Trans Pattern Anal Mach Intell. 2013 May;35(5):1121-34. doi: 10.1109/TPAMI.2012.185.

DANGLE: A Bayesian inferential method for predicting protein backbone dihedral angles and secondary structure.

J Magn Reson. 2010 Feb;202(2):223-33. doi: 10.1016/j.jmr.2009.11.008. Epub 2009 Dec 16.

Comparison of site-specific rate-inference methods for protein sequences: empirical Bayesian methods are superior.

Mol Biol Evol. 2004 Sep;21(9):1781-91. doi: 10.1093/molbev/msh194. Epub 2004 Jun 16.

Estimating trees from filtered data: identifiability of models for morphological phylogenetics.

J Theor Biol. 2010 Mar 7;263(1):108-19. doi: 10.1016/j.jtbi.2009.12.001. Epub 2009 Dec 11.

A Segmental Semi Markov Model for protein secondary structure prediction.

Math Biosci. 2009 Oct;221(2):130-5. doi: 10.1016/j.mbs.2009.07.004. Epub 2009 Jul 29.

Hierarchical bayesian modeling of topics in time-stamped documents.

IEEE Trans Pattern Anal Mach Intell. 2010 Jun;32(6):996-1011. doi: 10.1109/TPAMI.2009.125.

Bayesian mixture modeling using a hybrid sampler with application to protein subfamily identification.

Biostatistics. 2010 Jan;11(1):18-33. doi: 10.1093/biostatistics/kxp033. Epub 2009 Aug 20.

Bayesian methods for predicting interacting protein pairs using domain information.

Biometrics. 2007 Sep;63(3):824-33. doi: 10.1111/j.1541-0420.2007.00755.x.

Cited by

Evaluation of clustering and topic modeling methods over health-related tweets and emails.

Artif Intell Med. 2021 Jul;117:102096. doi: 10.1016/j.artmed.2021.102096. Epub 2021 May 7.

Exploiting topic modeling to boost metagenomic reads binning.

BMC Bioinformatics. 2015;16 Suppl 5(Suppl 5):S2. doi: 10.1186/1471-2105-16-S5-S2. Epub 2015 Mar 18.

Inferring functional modules of protein families with probabilistic topic models.

BMC Bioinformatics. 2011 May 9;12:141. doi: 10.1186/1471-2105-12-141.
