Gene regulation network inference using k-nearest neighbor-based mutual information estimation: revisiting an old DREAM.

Affiliations

Department of Biophysics, Johns Hopkins University, 3400 N. Charles Street, Baltimore, MD, 21218, USA.

10x Genomics, 6230 Stoneridge Mall Road, Pleasanton, CA, 94588-3260, USA.

Publication information

BMC Bioinformatics. 2023 Mar 6;24(1):84. doi: 10.1186/s12859-022-05047-5.

DOI: 10.1186/s12859-022-05047-5
PMID: 36879188
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9990267/
Abstract

BACKGROUND

A cell exhibits a variety of responses to internal and external cues. These responses are possible, in part, due to the presence of an elaborate gene regulatory network (GRN) in every single cell. In the past 20 years, many groups worked on reconstructing the topological structure of GRNs from large-scale gene expression data using a variety of inference algorithms. Insights gained about participating players in GRNs may ultimately lead to therapeutic benefits. Mutual information (MI) is a widely used metric within this inference/reconstruction pipeline as it can detect any correlation (linear and non-linear) between any number of variables (n-dimensions). However, the use of MI with continuous data (for example, normalized fluorescence intensity measurement of gene expression levels) is sensitive to data size, correlation strength and underlying distributions, and often requires laborious and, at times, ad hoc optimization.
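
For reference, the standard definition of MI between two continuous variables, together with its closed form for a bivariate Gaussian, can be written as below. The symbols (X, Y, p, and the correlation coefficient \rho) are notation assumed here for illustration and do not come from the abstract; the Gaussian case is useful because its exact value gives a target against which an estimator can be checked.

```latex
% Mutual information of two continuous random variables X and Y (in nats)
I(X;Y) = \iint p(x,y)\,\ln\frac{p(x,y)}{p(x)\,p(y)}\,\mathrm{d}x\,\mathrm{d}y

% Closed form when (X,Y) is bivariate Gaussian with correlation coefficient \rho
I(X;Y) = -\tfrac{1}{2}\,\ln\!\left(1-\rho^{2}\right)
```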

RESULTS

In this work, we first show that estimating MI of a bi- and tri-variate Gaussian distribution using k-nearest neighbor (kNN) MI estimation results in significant error reduction as compared to commonly used methods based on fixed binning. Second, we demonstrate that implementing the MI-based kNN Kraskov-Stögbauer-Grassberger (KSG) algorithm leads to a significant improvement in GRN reconstruction for popular inference algorithms, such as Context Likelihood of Relatedness (CLR). Finally, through extensive in-silico benchmarking we show that a new inference algorithm CMIA (Conditional Mutual Information Augmentation), inspired by CLR, in combination with the KSG-MI estimator, outperforms commonly used methods.
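
The comparison between kNN-based and fixed-binning MI estimation can be illustrated with a minimal sketch. It assumes the first KSG estimator (neighbor counts under the max norm, digamma correction) and a plain equal-width histogram estimator; it is not the authors' code. All function names, the sample size, k, the bin count, and the correlation value are hypothetical choices, and the brute-force neighbor search stands in for the KD-tree a real implementation would likely use.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_table(n_max):
    """psi(m) for integers m = 1..n_max via psi(m) = -gamma + sum_{j<m} 1/j."""
    table = [float("nan"), -EULER_GAMMA]  # index 0 is unused
    for m in range(2, n_max + 1):
        table.append(table[-1] + 1.0 / (m - 1))
    return table


def ksg_mi(xs, ys, k=3):
    """KSG (first variant) estimate of I(X;Y) in nats, brute-force neighbours."""
    n = len(xs)
    psi = digamma_table(n)
    acc = 0.0
    for i in range(n):
        dx = [abs(xs[i] - xs[j]) for j in range(n) if j != i]
        dy = [abs(ys[i] - ys[j]) for j in range(n) if j != i]
        # Distance to the k-th nearest neighbour in the joint space (max norm).
        eps = sorted(max(a, b) for a, b in zip(dx, dy))[k - 1]
        # Count marginal neighbours strictly closer than eps.
        nx = sum(1 for a in dx if a < eps)
        ny = sum(1 for b in dy if b < eps)
        acc += psi[nx + 1] + psi[ny + 1]
    return psi[k] + psi[n] - acc / n


def binned_mi(xs, ys, bins=10):
    """Plug-in MI estimate (nats) from an equal-width 2D histogram."""
    n = len(xs)

    def to_bin(v, lo, hi):
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    lo_x, hi_x, lo_y, hi_y = min(xs), max(xs), min(ys), max(ys)
    joint, cx, cy = {}, {}, {}
    for x, y in zip(xs, ys):
        a, b = to_bin(x, lo_x, hi_x), to_bin(y, lo_y, hi_y)
        joint[(a, b)] = joint.get((a, b), 0) + 1
        cx[a] = cx.get(a, 0) + 1
        cy[b] = cy.get(b, 0) + 1
    return sum(c / n * math.log(c * n / (cx[a] * cy[b]))
               for (a, b), c in joint.items())


def sample_bivariate_gaussian(n, rho, seed=0):
    """Draw n pairs from a standard bivariate Gaussian with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0))
    return xs, ys


def gaussian_mi(rho):
    """Exact MI (nats) of a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log(1.0 - rho * rho)


xs, ys = sample_bivariate_gaussian(n=500, rho=0.8)
print("analytic:", gaussian_mi(0.8))
print("KSG k=3 :", ksg_mi(xs, ys, k=3))
print("10 bins :", binned_mi(xs, ys, bins=10))
```

Rerunning the last lines with different bin counts and sample sizes gives a rough sense of how strongly the histogram estimate depends on those choices, whereas the kNN estimate has only k to tune.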

CONCLUSIONS

Using three canonical datasets containing 15 synthetic networks, the newly developed method for GRN reconstruction, which combines CMIA and the KSG-MI estimator, achieves an improvement of 20-35% in precision-recall measures over the current gold standard in the field. This new method will enable researchers to discover new gene interactions or better choose gene candidates for experimental validations.
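
To make the evaluation step concrete, the sketch below shows one way a pairwise MI matrix could be turned into ranked edges and scored against a known network. It assumes the commonly described CLR background correction (per-gene z-scores combined as a root sum of squares) and uses average precision as a stand-in for the precision-recall measure; the exact scoring used in the paper and the CMIA algorithm itself are not reproduced here. The gene names, MI values, and true edges are made-up toy data, and excluding the diagonal from the row statistics is an implementation choice.

```python
import math
from itertools import combinations


def clr_scores(mi):
    """CLR-style scores from a symmetric MI matrix (list of lists)."""
    n = len(mi)
    means, stds = [], []
    for i in range(n):
        row = [mi[i][j] for j in range(n) if j != i]  # skip self-MI
        mu = sum(row) / len(row)
        means.append(mu)
        stds.append(math.sqrt(sum((v - mu) ** 2 for v in row) / len(row)))

    def z(i, j):
        # z-score of MI(i, j) against gene i's background, clipped at zero.
        if stds[i] == 0.0:
            return 0.0
        return max(0.0, (mi[i][j] - means[i]) / stds[i])

    return [[0.0 if i == j else math.hypot(z(i, j), z(j, i))
             for j in range(n)] for i in range(n)]


def average_precision(scored_edges, true_edges):
    """Mean of the precision values at each rank where a true edge is found."""
    ranked = sorted(scored_edges, key=lambda item: item[1], reverse=True)
    hits, total = 0, 0.0
    for rank, (edge, _score) in enumerate(ranked, start=1):
        if edge in true_edges:
            hits += 1
            total += hits / rank
    return total / len(true_edges) if true_edges else 0.0


# Toy example: hypothetical genes, MI values, and reference network.
genes = ["g1", "g2", "g3", "g4"]
mi_matrix = [
    [0.00, 0.60, 0.05, 0.10],
    [0.60, 0.00, 0.40, 0.08],
    [0.05, 0.40, 0.00, 0.07],
    [0.10, 0.08, 0.07, 0.00],
]
truth = {frozenset(("g1", "g2")), frozenset(("g2", "g3"))}

clr = clr_scores(mi_matrix)
edges = [(frozenset((genes[i], genes[j])), clr[i][j])
         for i, j in combinations(range(len(genes)), 2)]
print("average precision:", average_precision(edges, truth))
```

Edges are stored as `frozenset` pairs because MI is symmetric, so this sketch ranks undirected interactions only.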
