SiGMoiD: A super-statistical generative model for binary data.

Affiliations

Department of Physics, University of Florida, Gainesville, Florida, United States of America.

Elanco Animal Health, Greenfield, Indiana, United States of America.

Publication information

PLoS Comput Biol. 2021 Aug 6;17(8):e1009275. doi: 10.1371/journal.pcbi.1009275. eCollection 2021 Aug.

DOI:10.1371/journal.pcbi.1009275
PMID:34358223
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8372922/
Abstract

In modern computational biology, there is great interest in building probabilistic models to describe collections of a large number of co-varying binary variables. However, current approaches to building generative models rely on the modeler's identification of constraints and are computationally expensive to infer when the number of variables is large (N~100). Here, we address both of these issues with the Super-statistical Generative Model for binary Data (SiGMoiD). SiGMoiD is a maximum entropy-based framework in which we imagine the data as arising from a super-statistical system: individual binary variables in a given sample are coupled to the same 'bath', whose intensive variables vary from sample to sample. Importantly, unlike standard maximum entropy approaches, where the modeler specifies the constraints, the SiGMoiD algorithm infers them directly from the data. Due to this optimal choice of constraints, SiGMoiD allows us to model collections of a very large number (N>1000) of binary variables. Finally, SiGMoiD offers a reduced-dimensional description of the data, allowing us to identify clusters of similar data points as well as of binary variables. We illustrate the versatility of SiGMoiD using multiple datasets spanning several time- and length-scales.
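The 'bath' picture in the abstract can be made concrete with a small simulation. The sketch below is illustrative only and is not the authors' implementation. It assumes, as a guess at the model's general shape, that each sample s carries a K-dimensional vector of intensive variables `beta[s]`, that each binary variable i carries a K-dimensional feature vector `energies[i]`, and that the variables in a sample are conditionally independent with logistic activation probabilities. The names `beta`, `energies`, `sample_binary`, and the sizes S, N, K are hypothetical and do not come from the paper, and the sketch covers only the generative direction, not the inference of the constraints from data.

```python
import numpy as np


def sample_binary(beta, energies, rng):
    """Draw binary samples given per-sample bath variables (illustrative).

    beta:     (S, K) hypothetical intensive variables, one row per sample
    energies: (N, K) hypothetical features, one row per binary variable
    Returns (samples, probs), each of shape (S, N).
    """
    # Every variable in sample s is coupled to the same beta[s].
    logits = -beta @ energies.T              # shape (S, N), rank at most K
    probs = 1.0 / (1.0 + np.exp(-logits))    # logistic activation probability
    samples = (rng.random(probs.shape) < probs).astype(np.int8)
    return samples, probs


rng = np.random.default_rng(0)
S, N, K = 500, 1000, 3                       # assumed sizes, K much smaller than N
beta = rng.normal(size=(S, K))
energies = rng.normal(size=(N, K))
samples, probs = sample_binary(beta, energies, rng)
```

Under these assumptions, all N variables in a sample co-vary through only K shared numbers, which is one way to read the abstract's claim of a reduced-dimensional description: rows of `beta` place samples in a K-dimensional space, and rows of `energies` do the same for the binary variables.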


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd6f/8372922/47b03ac05ab6/pcbi.1009275.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd6f/8372922/a11682af237d/pcbi.1009275.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd6f/8372922/4972c0aef166/pcbi.1009275.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd6f/8372922/64e2d7756bbd/pcbi.1009275.g003.jpg


