Bipartite networks represent causality better than simple networks: evidence, algorithms, and applications.

Authors

Shen Bingran, Curozzi Gloria, Shasha Dennis

Affiliations

Courant Institute of Mathematical Sciences, Department of Computer Science, New York University, New York, United States.

Center for Genomics and Systems Biology, Department of Biology, New York University, New York, United States.

Publication information

Front Genet. 2024 May 9;15:1371607. doi: 10.3389/fgene.2024.1371607. eCollection 2024.

DOI:10.3389/fgene.2024.1371607
PMID:38798697
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11120958/
Abstract

A network whose nodes are genes and whose directed edges represent the positive or negative influence of a regulatory gene on its targets is often used as a representation of causality. To infer a network, researchers often develop a machine learning model and then evaluate the model based on its match with experimentally verified "gold standard" edges. The desired result of such a model is a network that may extend the gold standard edges. Since networks are a form of visual representation, one can compare their utility with that of architectural or machine blueprints. Blueprints are clearly useful because they provide precise guidance to builders in construction. If the primary role of gene regulatory networks is to characterize causality, then such networks should be good tools of prediction, because prediction is the actionable benefit of knowing causality. But are they? In this paper, we compare prediction quality based on "gold standard" regulatory edges from previous experimental work with non-linear models inferred from time series data across four different species. We show that the non-linear machine learning models inferred from time series data have better predictive performance than the same models based on the gold standard edges, with reductions in root mean square error (RMSE) ranging from 5.3% to 25.3%. Having established that networks fail to characterize causality properly, we suggest that causality research should focus on four goals: (i) predictive accuracy; (ii) a parsimonious enumeration of predictive regulatory genes for each target gene; (iii) the identification, for each target, of disjoint sets of predictive regulatory genes of roughly equal accuracy; and (iv) the construction of a bipartite network representation of causality, whose node types are genes and models. We provide algorithms for all goals.
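To make the quantities named in the abstract concrete, the following minimal Python sketch shows how a relative RMSE reduction can be computed and one possible way to store a bipartite gene/model network, in which each model node takes a set of regulator genes as input and predicts a single target gene. All names here (`rmse`, `rmse_reduction_pct`, `BipartiteNetwork`, `add_model`, `disjoint_models`, and the gene and model identifiers) are hypothetical and are not taken from the paper; the greedy selection of disjoint regulator sets is a simple stand-in and not necessarily the authors' algorithm.

```python
import math
from collections import defaultdict


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def rmse_reduction_pct(rmse_baseline, rmse_model):
    """Percentage by which rmse_model improves on rmse_baseline."""
    return 100.0 * (rmse_baseline - rmse_model) / rmse_baseline


class BipartiteNetwork:
    """Two node types, genes and models: regulator genes -> model -> target gene."""

    def __init__(self):
        self.model_inputs = {}                      # model id -> regulator genes
        self.model_target = {}                      # model id -> target gene
        self.models_for_target = defaultdict(list)  # target gene -> model ids

    def add_model(self, model_id, regulators, target):
        self.model_inputs[model_id] = frozenset(regulators)
        self.model_target[model_id] = target
        self.models_for_target[target].append(model_id)

    def disjoint_models(self, target):
        """Greedily keep models for `target` whose regulator sets do not overlap."""
        chosen, used = [], set()
        for model_id in self.models_for_target[target]:
            regulators = self.model_inputs[model_id]
            if used.isdisjoint(regulators):
                chosen.append(model_id)
                used |= regulators
        return chosen


# Hypothetical example: three candidate models predicting the same target gene.
net = BipartiteNetwork()
net.add_model("m1", {"geneA", "geneB"}, "geneX")
net.add_model("m2", {"geneB", "geneC"}, "geneX")
net.add_model("m3", {"geneD"}, "geneX")

print(net.disjoint_models("geneX"))            # ['m1', 'm3']
print(round(rmse_reduction_pct(2.0, 1.5), 1))  # 25.0
```

In this representation a target gene can have several model nodes, each with its own regulator set, which is what allows alternative, non-overlapping explanations of the same target to coexist in one network.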

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2921/11120958/2a2b45c2e780/fgene-15-1371607-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2921/11120958/563352985fbc/fgene-15-1371607-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2921/11120958/04fb5e14d6d8/fgene-15-1371607-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2921/11120958/9662b065370c/fgene-15-1371607-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2921/11120958/f81ea93e5764/fgene-15-1371607-g005.jpg
