Department of Computer Science, University of Pittsburgh, 4200 Fifth Avenue, Pittsburgh, PA 15260, USA.
Department of Computational and Systems Biology, University of Pittsburgh, 3420 Forbes Ave, Pittsburgh, PA 15213, USA.
Nucleic Acids Res. 2020 Jul 2;48(W1):W597-W602. doi: 10.1093/nar/gkaa350.
High-throughput sequencing and the availability of large online data repositories (e.g. The Cancer Genome Atlas and Trans-Omics for Precision Medicine) have the potential to revolutionize systems biology by enabling researchers to study interactions between data from different modalities (e.g. genetic, genomic, clinical and behavioral). Currently, data mining and statistical approaches are confined to identifying correlates in these datasets, but researchers are often interested in identifying cause-and-effect relationships. Causal discovery methods were developed to infer such cause-and-effect relationships from observational data. Although these algorithms have demonstrated success in several biomedical applications, they are difficult for non-experts to use. There is therefore a need for web-based tools that make causal discovery methods accessible. Here, we present CausalMGM (http://causalmgm.org/), the first web-based causal discovery tool that enables researchers to find cause-and-effect relationships from observational data. Web-based CausalMGM consists of three data analysis tools: (i) feature selection and clustering; (ii) automated identification of cause-and-effect relationships via a graphical model; and (iii) interactive visualization of the learned causal (directed) graph. We demonstrate how CausalMGM enables an end-to-end exploratory analysis of biomedical datasets, giving researchers a clearer picture of its capabilities.