Biological network inference using low order partial correlation.

Authors

Zuo Yiming, Yu Guoqiang, Tadesse Mahlet G, Ressom Habtom W

Affiliations

Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC, USA; Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, USA.

Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, USA.

Publication information

Methods. 2014 Oct 1;69(3):266-73. doi: 10.1016/j.ymeth.2014.06.010. Epub 2014 Jul 5.

DOI:10.1016/j.ymeth.2014.06.010

PMID:25003577

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4194134/

Abstract

Biological network inference is a major challenge in systems biology. Traditional correlation-based network analysis produces too many spurious edges because correlation cannot distinguish between direct and indirect associations. To address this issue, Gaussian graphical models (GGM) were proposed and have been widely used. Although they can significantly reduce the number of spurious edges, GGM are insufficient to uncover a network structure faithfully because they consider only the full order partial correlation. Moreover, when the number of samples is smaller than the number of variables, additional techniques based on sparse regularization must be incorporated into GGM to solve the singular covariance inversion problem. In this paper, we propose an efficient and mathematically solid algorithm that infers biological networks by computing low order partial correlation (LOPC) up to the second order. The bias introduced by the low order constraint is minimal compared to the more reliable approximation of the network structure achieved. In addition, the algorithm is suitable for datasets with a small sample size but a large number of variables. Simulation results show that LOPC yields far fewer spurious edges and works well under various conditions commonly seen in practice. The application to a real metabolomics dataset further validates the performance of LOPC and suggests its potential power in detecting novel biomarkers for complex disease.
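The idea of conditioning on at most two other variables can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the statistical test used to keep or discard an edge, so the fixed cutoff `threshold` below is a hypothetical stand-in, and the function names `first_order`, `second_order` and `lopc_edges` are invented for illustration. The sketch relies only on the standard recursion for partial correlation, r_ij.k = (r_ij - r_ik r_jk) / sqrt((1 - r_ik^2)(1 - r_jk^2)), applied once more to reach the second order.

```python
import itertools
import numpy as np

def first_order(r, i, j, k):
    """Partial correlation of variables i and j given k, from correlation matrix r."""
    num = r[i, j] - r[i, k] * r[j, k]
    den = np.sqrt((1.0 - r[i, k] ** 2) * (1.0 - r[j, k] ** 2))
    return num / den

def second_order(r, i, j, k, q):
    """Partial correlation of i and j given k and q, by recursing on first-order terms."""
    r_ij_k = first_order(r, i, j, k)
    r_iq_k = first_order(r, i, q, k)
    r_jq_k = first_order(r, j, q, k)
    den = np.sqrt((1.0 - r_iq_k ** 2) * (1.0 - r_jq_k ** 2))
    return (r_ij_k - r_iq_k * r_jq_k) / den

def lopc_edges(data, threshold=0.2):
    """Keep edge (i, j) only if the zeroth-, first- and second-order partial
    correlations all exceed `threshold` in absolute value.
    ASSUMPTION: a fixed cutoff replaces whatever significance test the paper uses."""
    r = np.corrcoef(data, rowvar=False)
    p = r.shape[0]
    edges = []
    for i, j in itertools.combinations(range(p), 2):
        others = [k for k in range(p) if k not in (i, j)]
        values = [r[i, j]]
        values += [first_order(r, i, j, k) for k in others]
        values += [second_order(r, i, j, k, q)
                   for k, q in itertools.combinations(others, 2)]
        if min(abs(v) for v in values) > threshold:
            edges.append((i, j))
    return edges

# Toy chain x0 -> x1 -> x2 -> x3: only neighbouring variables interact directly.
rng = np.random.default_rng(0)
n = 2000
x0 = rng.normal(size=n)
x1 = x0 + rng.normal(size=n)
x2 = x1 + rng.normal(size=n)
x3 = x2 + rng.normal(size=n)
data = np.column_stack([x0, x1, x2, x3])

# Plain correlation links every pair, including the indirect ones.
r = np.corrcoef(data, rowvar=False)
corr_edges = [(i, j) for i, j in itertools.combinations(range(4), 2)
              if abs(r[i, j]) > 0.2]

edges = lopc_edges(data)
print("correlation only:", corr_edges)
print("low order partial correlation:", edges)
```

On this toy chain, plain correlation connects all six pairs, while the low order filter keeps only the three direct links. Note that the sketch needs only pairwise correlations and never inverts the covariance matrix, which is why an approach of this kind remains usable when samples are fewer than variables.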
