Exploiting the functional and taxonomic structure of genomic data by probabilistic topic modeling.

Affiliation

College of Information Science & Technology, Drexel University, Philadelphia, PA 19104, USA.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2012 Jul-Aug;9(4):980-91. doi: 10.1109/TCBB.2011.113.

DOI:10.1109/TCBB.2011.113
PMID:21844637
Abstract

In this paper, we present a method that enables both the homology-based approach and the composition-based approach to further study the functional core (i.e., the microbial core and the gene core, respectively). In the proposed method, major functional groups are identified by generative topic modeling, which can extract useful information from unlabeled data. We first show that a generative topic model can be used to model the taxon abundance information obtained by the homology-based approach and to study the microbial core. The model considers each sample as a “document” that has a mixture of functional groups, while each functional group (also known as a “latent topic”) is a weighted mixture of species. Therefore, estimating the generative topic model for taxon abundance data uncovers the distribution over latent functions (latent topics) in each sample. Second, we show that a generative topic model can also be used to study the genome-level composition of “N-mer” features (DNA subreads obtained by composition-based approaches). The model considers each genome as a mixture of latent genetic patterns (latent topics), while each functional pattern is a weighted mixture of the “N-mer” features; thus, the existence of core genomes can be indicated by a set of common N-mer features. After studying the mutual information between latent topics and gene regions, we provide an explanation of the functional roles of the uncovered latent genetic patterns. The experimental results demonstrate the effectiveness of the proposed method.
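
The abstract treats each genome as a "document" whose "words" are N-mer features. As a rough illustration of that representation only, the sketch below builds a genome-by-N-mer count matrix, which is the kind of input a topic model would consume. It does not fit a topic model and is not the authors' implementation. The function names (`nmer_counts`, `count_matrix`), the toy sequences, and the choice of N = 3 are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def nmer_counts(sequence, n):
    """Count overlapping N-mers in one DNA sequence.

    Windows containing symbols other than A, C, G, T are skipped.
    """
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - n + 1):
        window = sequence[i:i + n]
        if set(window) <= set("ACGT"):
            counts[window] += 1
    return counts


def count_matrix(genomes, n):
    """Build a genome-by-N-mer count matrix.

    Returns the vocabulary (all 4**n N-mers in lexicographic order) and a
    dict mapping each genome name to its row of counts.
    """
    vocabulary = ["".join(p) for p in product("ACGT", repeat=n)]
    rows = {}
    for name, seq in genomes.items():
        counts = nmer_counts(seq, n)
        rows[name] = [counts[w] for w in vocabulary]
    return vocabulary, rows


# Toy sequences invented for illustration; they are not data from the paper.
toy_genomes = {
    "genome_a": "ACGTACGTAC",
    "genome_b": "ACGTTTTTAC",
}
vocabulary, matrix = count_matrix(toy_genomes, n=3)

# N-mers observed in every genome: a crude stand-in for the idea of
# "common N-mer features", not the topic-model-based analysis itself.
shared = [w for j, w in enumerate(vocabulary)
          if all(row[j] > 0 for row in matrix.values())]
print(shared)
```

Each row of `matrix` plays the role of a document's word counts. In a full analysis, a topic model such as latent Dirichlet allocation would be estimated on this matrix to obtain per-genome topic mixtures and per-topic N-mer weights.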

Similar articles

1
Exploiting the functional and taxonomic structure of genomic data by probabilistic topic modeling.
IEEE/ACM Trans Comput Biol Bioinform. 2012 Jul-Aug;9(4):980-91. doi: 10.1109/TCBB.2011.113.
2
Estimating functional groups in human gut microbiome with probabilistic topic models.
IEEE Trans Nanobioscience. 2012 Sep;11(3):203-15. doi: 10.1109/TNB.2012.2212204.
3
Whole-Genome k-mer Topic Modeling Associates Bacterial Families.
Genes (Basel). 2020 Feb 14;11(2):197. doi: 10.3390/genes11020197.
4
Investigating topic models' capabilities in expression microarray data classification.
IEEE/ACM Trans Comput Biol Bioinform. 2012 Nov-Dec;9(6):1831-6. doi: 10.1109/TCBB.2012.121.
5
Probabilistic topic modeling for the analysis and classification of genomic sequences.
BMC Bioinformatics. 2015;16 Suppl 6(Suppl 6):S2. doi: 10.1186/1471-2105-16-S6-S2. Epub 2015 Apr 17.
6
Infer Metagenomic Abundance and Reveal Homologous Genomes Based on the Structure of Taxonomy Tree.
IEEE/ACM Trans Comput Biol Bioinform. 2015 Sep-Oct;12(5):1112-22. doi: 10.1109/TCBB.2015.2415814.
7
Incorporating comorbidities into latent treatment pattern mining for clinical pathways.
J Biomed Inform. 2016 Feb;59:227-39. doi: 10.1016/j.jbi.2015.12.012. Epub 2015 Dec 21.
8
Bioinformatic progress and applications in metaproteogenomics for bridging the gap between genomic sequences and metabolic functions in microbial communities.
Proteomics. 2013 Oct;13(18-19):2786-804. doi: 10.1002/pmic.201200566. Epub 2013 Aug 7.
9
Computational integration of genomic traits into 16S rDNA microbiota sequencing studies.
Gene. 2014 Oct 1;549(1):186-91. doi: 10.1016/j.gene.2014.07.066. Epub 2014 Jul 30.
10
Dirichlet multinomial mixtures: generative models for microbial metagenomics.
PLoS One. 2012;7(2):e30126. doi: 10.1371/journal.pone.0030126. Epub 2012 Feb 3.

Cited by

1
Topic modeling revisited: New evidence on algorithm performance and quality metrics.
PLoS One. 2022 Apr 28;17(4):e0266325. doi: 10.1371/journal.pone.0266325. eCollection 2022.
2
Whole-Genome k-mer Topic Modeling Associates Bacterial Families.
Genes (Basel). 2020 Feb 14;11(2):197. doi: 10.3390/genes11020197.
3
An overview of topic modeling and its current applications in bioinformatics.
Springerplus. 2016 Sep 20;5(1):1608. doi: 10.1186/s40064-016-3252-8. eCollection 2016.
4
Exploiting topic modeling to boost metagenomic reads binning.
BMC Bioinformatics. 2015;16 Suppl 5(Suppl 5):S2. doi: 10.1186/1471-2105-16-S5-S2. Epub 2015 Mar 18.