Information-incorporated gene network construction with FDR control.

Affiliations

Department of Statistics, Iowa State University, Ames, IA 50010, United States.

Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50010, United States.

Publication information

Bioinformatics. 2024 Mar 4;40(3). doi: 10.1093/bioinformatics/btae125.

DOI:10.1093/bioinformatics/btae125
PMID:38430463
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10937901/
Abstract

MOTIVATION

Large-scale gene expression studies allow gene network construction to uncover associations among genes. To study direct associations among genes, partial correlation-based networks are preferred over marginal correlations. However, FDR control for partial correlation-based network construction is not well-studied. In addition, currently available partial correlation-based methods cannot take existing biological knowledge to help network construction while controlling FDR.
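The distinction between marginal and partial correlation can be illustrated with three variables. The sketch below uses the standard first-order partial correlation formula; the gene labels and correlation values are hypothetical and are not taken from the paper.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y after adjusting for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical marginal correlations: genes A and B are both driven by gene C,
# so they look associated marginally even though they have no direct link.
r_ab, r_ac, r_bc = 0.49, 0.7, 0.7

rho_ab_given_c = partial_corr(r_ab, r_ac, r_bc)  # A-B association after removing C
rho_ac_given_b = partial_corr(r_ac, r_ab, r_bc)  # A-C association after removing B

print(f"marginal A-B: {r_ab:.2f}, partial A-B | C: {rho_ab_given_c:.2f}")
print(f"marginal A-C: {r_ac:.2f}, partial A-C | B: {rho_ac_given_b:.2f}")
```

In this toy setting the A-B partial correlation is essentially zero while the A-C partial correlation stays clearly positive, which is why a partial correlation network would keep the A-C edge and drop the indirect A-B edge.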

RESULTS

In this paper, we propose a method called Partial Correlation Graph with Information Incorporation (PCGII). PCGII estimates partial correlations between each pair of genes by regularized node-wise regression that can incorporate prior knowledge while controlling the effects of all other genes. It handles high-dimensional data where the number of genes can be much larger than the sample size and controls FDR at the same time. We compare PCGII with several existing approaches through extensive simulation studies and demonstrate that PCGII has better FDR control and higher power. We apply PCGII to a plant gene expression dataset where it recovers confirmed regulatory relationships and a hub node, as well as several direct associations that shed light on potential functional relationships in the system. We also introduce a method to supplement observed data with a pseudogene to apply PCGII when no prior information is available, which also allows checking FDR control and power for real data analysis.
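To make "controlling FDR" over candidate edges concrete, here is a minimal sketch of the generic Benjamini-Hochberg step-up procedure applied to per-edge p-values. This is an illustration only: the abstract does not describe PCGII's own testing procedure, so this should not be read as the authors' method, and the edge labels and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return indices of hypotheses rejected by the Benjamini-Hochberg procedure."""
    m = len(pvalues)
    # Rank hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value passes its threshold.
        if pvalues[idx] <= alpha * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical p-values for tests of "partial correlation equals zero" on each gene pair.
edge_pvalues = {
    ("A", "B"): 0.001,
    ("A", "C"): 0.012,
    ("B", "C"): 0.04,
    ("A", "D"): 0.20,
    ("B", "D"): 0.74,
    ("C", "D"): 0.03,
}

edges = list(edge_pvalues)
rejected = benjamini_hochberg([edge_pvalues[e] for e in edges], alpha=0.05)
selected_edges = [edges[i] for i in rejected]
print(selected_edges)
```

With these inputs only the A-B and A-C edges are selected at a nominal 5% FDR. The Benjamini-Hochberg guarantee assumes valid p-values that are independent or positively dependent, which is one reason partial correlation networks call for dedicated procedures.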

AVAILABILITY AND IMPLEMENTATION

The R package is freely available for download at https://cran.r-project.org/package=PCGII.

