LEARNING LOCAL DIRECTED ACYCLIC GRAPHS BASED ON MULTIVARIATE TIME SERIES DATA.

Authors

Deng Wanlu, Geng Zhi, Li Hongzhe

Affiliations

Department of Statistics and Probability, Peking University, Beijing 100871, PR China. Department of Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA.

Publication information

Ann Appl Stat. 2013;7(3):1249-1272. doi: 10.1214/13-aoas635.

DOI:10.1214/13-aoas635
PMID:24465291
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3898602/
Abstract

Multivariate time series (MTS) data such as time course gene expression data in genomics are often collected to study the dynamic nature of the systems. These data provide important information about the causal dependency among a set of random variables. In this paper, we introduce a computationally efficient algorithm to learn directed acyclic graphs (DAGs) based on MTS data, focusing on learning the local structure of a given target variable. Our algorithm is based on learning all parents (P), all children (C) and some descendants (D) (PCD) iteratively, utilizing the time order of the variables to orient the edges. This time series PCD-PCD algorithm (tsPCD-PCD) extends the previous PCD-PCD algorithm to dependent observations and utilizes composite likelihood ratio tests (CLRTs) for testing the conditional independence. We present the asymptotic distribution of the CLRT statistic and show that the tsPCD-PCD is guaranteed to recover the true DAG structure when the faithfulness condition holds and the tests correctly reject the null hypotheses. Simulation studies show that the CLRTs are valid and perform well even when the sample sizes are small. In addition, the tsPCD-PCD algorithm outperforms the PCD-PCD algorithm in recovering the local graph structures. We illustrate the algorithm by analyzing a time course gene expression data set related to mouse T-cell activation.
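
The idea of testing conditional independence on lagged variables and orienting edges by time order can be illustrated with a small sketch. This is not the tsPCD-PCD algorithm and it does not implement the paper's composite likelihood ratio test: a plain Gaussian likelihood ratio test from nested linear regressions stands in for the CLRT, and the simulated three-variable lag-1 system, the function name `lrt_cond_indep`, and the 0.01 threshold are all assumptions made for illustration.

```python
import math
import numpy as np


def lrt_cond_indep(y, x, z):
    """Gaussian likelihood ratio test of y independent of x given z.

    Compares the regression of y on z with the regression of y on (z, x).
    y: shape (n,), x: shape (n,), z: shape (n, k).
    Returns (statistic, p_value); the statistic is referred to chi-square(1).
    Illustrative stand-in only, not the paper's CLRT.
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    base = np.column_stack([np.ones(n), np.asarray(z, dtype=float)])
    full = np.column_stack([base, np.asarray(x, dtype=float)])

    def rss(design):
        # Residual sum of squares of the least-squares fit.
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        return float(resid @ resid)

    stat = max(n * math.log(rss(base) / rss(full)), 0.0)
    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical lag-1 system: X0 -> X1 -> X2, each edge acting from t-1 to t.
rng = np.random.default_rng(0)
n_steps, n_vars = 500, 3
X = np.zeros((n_steps, n_vars))
for t in range(1, n_steps):
    X[t, 0] = 0.5 * X[t - 1, 0] + rng.normal()
    X[t, 1] = 0.8 * X[t - 1, 0] + rng.normal()
    X[t, 2] = 0.8 * X[t - 1, 1] + rng.normal()

# Local structure of one target: which lagged variables remain dependent
# on the target at time t given all other lagged variables?
target = 1
past, present = X[:-1], X[1:, target]
parents = []
for j in range(n_vars):
    others = np.delete(past, j, axis=1)
    stat, p_value = lrt_cond_indep(present, past[:, j], others)
    if p_value < 0.01:
        # Time order fixes the direction: X_j at t-1 -> target at t.
        parents.append(j)

print("estimated lag-1 parents of X1:", parents)
```

Because every candidate edge runs from a variable at time t-1 to the target at time t, no separate orientation step is needed in this sketch. The full algorithm described in the abstract additionally searches for children and descendants and accounts for dependence between observations, which this example ignores.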
