MRPC: An R Package for Inference of Causal Graphs.

Author information

Badsha Md Bahadur, Martin Evan A, Fu Audrey Qiuyan

Affiliations

Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, ID, United States.

The Graduate Program in Bioinformatics and Computational Biology, University of Idaho, Moscow, ID, United States.

Publication information

Front Genet. 2021 Apr 30;12:651812. doi: 10.3389/fgene.2021.651812. eCollection 2021.

Abstract

Understanding the causal relationships between variables is a central goal of many scientific inquiries. Causal relationships may be represented by directed edges in a graph (or, equivalently, a network). In biology, for example, a gene regulatory network may be viewed as a type of causal network, where X→Y represents gene X regulating (i.e., being causal to) gene Y. However, existing general-purpose graph inference methods often produce many false edges, whereas current causal inference methods developed for observational data in genomics can handle only limited types of causal relationships. We present MRPC (a PC algorithm with the principle of Mendelian randomization), an R package that learns causal graphs with improved accuracy over existing methods. Our algorithm builds on the powerful PC algorithm (named after its developers Peter Spirtes and Clark Glymour), a canonical algorithm in computer science for learning directed acyclic graphs. The improvements in MRPC increase accuracy in identifying v-structures (i.e., X→Y←Z) and make the inferred graph robust to the order in which the nodes are arranged in the input data. In the special case of genomic data that contain genotypes and phenotypes (e.g., gene expression) at the individual level, MRPC incorporates the principle of Mendelian randomization as constraints on edge direction to help orient the edges. MRPC thus supports inference of causal graphs not only for general purposes but also for biomedical data, where multiple types of data may be supplied as evidence for causality. The R package is available on CRAN as free, open-source software under a GPL (≥2) license.
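The two orientation ideas named in the abstract (edge-direction constraints from Mendelian randomization, and v-structures X→Y←Z) can be illustrated with a minimal Python sketch. This is not the MRPC package's code or API. The function `orient_edges`, its arguments, and the toy graph are hypothetical, and the sketch assumes that an undirected skeleton and the separating sets from conditional independence tests are already available.

```python
from itertools import combinations


def orient_edges(skeleton, sepsets, genotype_nodes):
    """Orient an undirected skeleton in two illustrative steps.

    skeleton:       set of frozenset({a, b}) undirected edges
    sepsets:        dict mapping frozenset({a, b}) to the set of nodes
                    that rendered a and b conditionally independent
    genotype_nodes: nodes treated as genetic variants (assumption)
    Returns (directed, undirected): a set of (parent, child) tuples and
    the set of edges left unoriented.
    """
    directed = set()

    # Step 1 (Mendelian randomization constraint): a genotype can
    # influence a phenotype, not the other way round.
    for edge in skeleton:
        a, b = tuple(edge)
        if a in genotype_nodes and b not in genotype_nodes:
            directed.add((a, b))
        elif b in genotype_nodes and a not in genotype_nodes:
            directed.add((b, a))

    # Step 2 (v-structures): for an unshielded triple x - y - z,
    # orient x -> y <- z when y is NOT in the separating set of x and z.
    nodes = sorted({n for e in skeleton for n in e})
    for y in nodes:
        neighbors = [n for n in nodes if frozenset((n, y)) in skeleton]
        for x, z in combinations(neighbors, 2):
            if frozenset((x, z)) in skeleton:
                continue  # x and z are adjacent, so the triple is shielded
            if y in sepsets.get(frozenset((x, z)), set()):
                continue  # y separates x and z, so y is not a collider
            if (y, x) in directed or (y, z) in directed:
                continue  # do not contradict an earlier orientation
            directed.add((x, y))
            directed.add((z, y))

    undirected = {
        e for e in skeleton
        if not any(pair in directed for pair in (tuple(e), tuple(e)[::-1]))
    }
    return directed, undirected


# Toy example (hypothetical): V is a genetic variant, T1..T3 are phenotypes.
skeleton = {frozenset(p) for p in [("V", "T1"), ("T1", "T2"), ("T3", "T2")]}
sepsets = {
    frozenset(("V", "T2")): {"T1"},   # V and T2 independent given T1
    frozenset(("T1", "T3")): set(),   # T1 and T3 marginally independent
    frozenset(("V", "T3")): set(),
}
directed, undirected = orient_edges(skeleton, sepsets, genotype_nodes={"V"})
print(sorted(directed))
```

In this toy graph the genotype constraint fixes V→T1, and the unshielded triple T1 - T2 - T3 becomes the v-structure T1→T2←T3 because T2 is not in the separating set of T1 and T3. The triple V - T1 - T2 is not turned into a collider, since T1 separates V from T2.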

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a974/8120292/d9b6509aa0b9/fgene-12-651812-g001.jpg
