Selecting high-dimensional mixed graphical models using minimal AIC or BIC forests.

Affiliation

Institute of Genetics and Biotechnology, Faculty of Agricultural Sciences, Aarhus University, Aarhus, Denmark.

Publication information

BMC Bioinformatics. 2010 Jan 11;11:18. doi: 10.1186/1471-2105-11-18.

DOI:10.1186/1471-2105-11-18
PMID:20064242
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2823705/
Abstract

BACKGROUND

Chow and Liu showed that the maximum likelihood tree for multivariate discrete distributions may be found using a maximum weight spanning tree algorithm, for example Kruskal's algorithm. The efficiency of the algorithm makes it tractable for high-dimensional problems.
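The Chow and Liu construction can be sketched in a few lines. The following is a minimal illustration for purely discrete data, not the authors' implementation: it weights each pair of variables by empirical mutual information and runs Kruskal's algorithm with a union-find structure. The function names `mutual_information` and `chow_liu_tree` and the column-wise input layout are assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # sum over observed cells of p(x, y) * log(p(x, y) / (p(x) * p(y)))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def chow_liu_tree(columns):
    """Return edges (i, j, weight) of a maximum weight spanning tree.

    `columns` is a list of equal-length sequences, one per variable
    (a hypothetical layout chosen for this sketch).
    """
    p = len(columns)
    # Kruskal: visit candidate edges from heaviest to lightest.
    edges = sorted(
        ((mutual_information(columns[i], columns[j]), i, j)
         for i, j in combinations(range(p), 2)),
        reverse=True,
    )
    parent = list(range(p))

    def find(a):
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so no cycle is created
            parent[ri] = rj
            tree.append((i, j, w))
    return tree
```

Computing all pairwise weights costs on the order of p² mutual information evaluations for p variables, and sorting the edges dominates the remainder, which is what keeps the method practical in high dimensions.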

RESULTS

We extend Chow and Liu's approach in two ways: first, to find the forest optimizing a penalized likelihood criterion, for example AIC or BIC, and second, to handle data with both discrete and Gaussian variables. We apply the approach to three datasets: two from gene expression studies and the third from a genetics of gene expression study. The minimal BIC forest supplements a conventional analysis of differential expression by providing a tentative network for the differentially expressed genes. In the genetics of gene expression context the method identifies a network approximating the joint distribution of the DNA markers and the gene expression levels.
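One way to read the penalized extension is as follows. Adding an edge between u and v lowers -2 log L by 2·n·I(u,v), where I is the mutual information and n the number of observations, and raises the penalty by k·log(n) for BIC or 2·k for AIC, where k is the number of extra parameters the edge introduces. An edge is therefore worth adding only if its penalized gain is positive, so the result may be a forest rather than a spanning tree. The sketch below assumes the mutual information and k for each candidate edge have already been computed; the function name `penalized_forest` and the edge tuple layout are hypothetical, and the abstract does not state the authors' exact weighting.

```python
import math


def penalized_forest(num_vars, n_obs, edges, criterion="bic"):
    """Kruskal on penalized edge gains, dropping edges that do not pay off.

    `edges` is an iterable of (i, j, mutual_info_nats, extra_params).
    Returns a list of (i, j, gain) for the retained edges.
    """
    per_param = math.log(n_obs) if criterion == "bic" else 2.0
    scored = []
    for i, j, mi, k in edges:
        # Reduction in the criterion if this edge is added on its own.
        gain = 2.0 * n_obs * mi - per_param * k
        if gain > 0:
            scored.append((gain, i, j))
    scored.sort(reverse=True)

    parent = list(range(num_vars))

    def find(a):
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    forest = []
    for gain, i, j in scored:
        ri, rj = find(i), find(j)
        if ri != rj:  # skip edges that would close a cycle
            parent[ri] = rj
            forest.append((i, j, gain))
    return forest


# Toy input: four variables, 100 observations, one parameter per edge.
candidate_edges = [(0, 1, 0.20, 1), (1, 2, 0.05, 1), (0, 2, 0.04, 1), (2, 3, 0.02, 1)]
bic_forest = penalized_forest(4, 100, candidate_edges, criterion="bic")
aic_forest = penalized_forest(4, 100, candidate_edges, criterion="aic")
```

In this toy input the weak edge between variables 2 and 3 survives under AIC but not under BIC, because log(100) exceeds 2; this shows how the heavier BIC penalty tends to give sparser forests.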

CONCLUSIONS

The approach is generally useful as a preliminary step towards understanding the overall dependence structure of high-dimensional discrete and/or continuous data. Trees and forests are unrealistically simple models for biological systems, but can provide useful insights. Uses include the following: identification of distinct connected components, which can be analysed separately (dimension reduction); identification of neighbourhoods for more detailed analyses; as initial models for search algorithms with a larger search space, for example decomposable models or Bayesian networks; and identification of interesting features, such as hub nodes.
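Two of the uses listed above, splitting the forest into connected components and spotting hub nodes, need nothing more than the edge list. The following sketch assumes the forest is given as pairs of integer node indices; `components_and_degrees` is a hypothetical helper written for this illustration.

```python
from collections import defaultdict, deque


def components_and_degrees(num_vars, edges):
    """Return (connected components, node degrees) for an undirected forest."""
    adj = defaultdict(set)
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)

    seen = set()
    components = []
    for start in range(num_vars):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        queue = deque([start])
        seen.add(start)
        comp = []
        while queue:
            v = queue.popleft()
            comp.append(v)
            for w in sorted(adj[v]):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(sorted(comp))

    degrees = {v: len(adj[v]) for v in range(num_vars)}
    return components, degrees


# Toy forest on seven nodes: a star around node 0, one extra edge, one isolate.
components, degrees = components_and_degrees(7, [(0, 1), (0, 2), (0, 3), (4, 5)])
hub = max(degrees, key=degrees.get)  # node with the most neighbours
```

Each component can then be analysed separately, and nodes with unusually high degree are candidates for closer inspection as hubs.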

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a14/2823705/40fa3c58e743/1471-2105-11-18-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a14/2823705/bfd38019e3e0/1471-2105-11-18-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a14/2823705/86897552e68b/1471-2105-11-18-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a14/2823705/49c00b1f910f/1471-2105-11-18-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a14/2823705/dfa56c180b05/1471-2105-11-18-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a14/2823705/8dcf77742ca3/1471-2105-11-18-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a14/2823705/43c2c7483034/1471-2105-11-18-7.jpg
