Laboratory for Information and Decision Systems and Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
Bioinformatics. 2021 Sep 29;37(18):3067-3069. doi: 10.1093/bioinformatics/btab167.
Designing interventions to control gene regulation requires modeling a gene regulatory network as a causal graph. Large-scale gene expression datasets are currently being collected from different conditions, cell types, disease states, and developmental time points. However, applying classical causal inference algorithms to infer gene regulatory networks from such data remains challenging, since these algorithms require large sample sizes and substantial computational resources. Here, we describe an algorithm that efficiently learns the differences in gene regulatory mechanisms between conditions. Our difference causal inference (DCI) algorithm infers the changes (i.e., edges that appeared, disappeared, or changed weight) between two causal graphs, given gene expression data from the two conditions. The algorithm is efficient in both samples and computation because it infers the differences between the causal graphs directly, without estimating each possibly large causal graph separately. We provide a user-friendly Python implementation of DCI and also enable the user to learn the most robust difference causal graph across different tuning parameters via stability selection. Finally, we show how to apply DCI to single-cell RNA-seq data from different conditions and cell states, and we validate our algorithm by predicting the effects of interventions.
The Python package is freely available at http://uhlerlab.github.io/causaldag/dci.
Supplementary data are available at Bioinformatics online.
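To make the notion of a difference causal graph concrete, the following is a minimal NumPy sketch on simulated data. It is not the DCI algorithm and does not use the causaldag API: it assumes a linear Gaussian model with a known causal ordering of the variables and fits each graph separately, which is exactly the step DCI is designed to avoid. All function names, weights, thresholds, and sample sizes are hypothetical choices made for illustration. The sketch only shows what the target is (edges that appeared, disappeared, or changed weight) and how a simple form of stability selection can retain edges that are selected consistently across subsamples and tuning parameters.

```python
import numpy as np


def simulate_linear_sem(weights, n_samples, rng):
    """Sample a linear Gaussian SEM; weights[i, j] is the weight of edge i -> j."""
    p = weights.shape[0]
    noise = rng.standard_normal((n_samples, p))
    # Row-vector form: x = x @ W + noise, hence x = noise @ inv(I - W).
    return noise @ np.linalg.inv(np.eye(p) - weights)


def fit_weights(data):
    """Regress each variable on its predecessors.

    Assumption: the columns of `data` are already in causal order.
    """
    p = data.shape[1]
    est = np.zeros((p, p))
    for j in range(1, p):
        coef, *_ = np.linalg.lstsq(data[:, :j], data[:, j], rcond=None)
        est[:j, j] = coef
    return est


def changed_edges(data1, data2, threshold):
    """Edges whose estimated weights differ by more than `threshold`."""
    diff = np.abs(fit_weights(data2) - fit_weights(data1))
    rows, cols = np.nonzero(diff > threshold)
    return {(int(i), int(j)) for i, j in zip(rows, cols)}


def stable_changed_edges(data1, data2, thresholds, n_subsamples, min_freq, rng):
    """Keep edges selected in at least `min_freq` of the half-subsamples
    for at least one threshold (a simple form of stability selection)."""
    best_freq = {}
    for threshold in thresholds:
        counts = {}
        for _ in range(n_subsamples):
            idx1 = rng.choice(len(data1), len(data1) // 2, replace=False)
            idx2 = rng.choice(len(data2), len(data2) // 2, replace=False)
            for edge in changed_edges(data1[idx1], data2[idx2], threshold):
                counts[edge] = counts.get(edge, 0) + 1
        for edge, count in counts.items():
            freq = count / n_subsamples
            best_freq[edge] = max(best_freq.get(edge, 0.0), freq)
    return {edge for edge, freq in best_freq.items() if freq >= min_freq}


rng = np.random.default_rng(0)

# Condition 1: edges 0 -> 1, 1 -> 2, 2 -> 3 (hypothetical toy network).
w1 = np.zeros((4, 4))
w1[0, 1], w1[1, 2], w1[2, 3] = 1.0, 0.8, 0.5

# Condition 2: 1 -> 2 disappears, 0 -> 2 appears, 2 -> 3 changes weight.
w2 = np.zeros((4, 4))
w2[0, 1], w2[0, 2], w2[2, 3] = 1.0, 0.7, 1.2

x1 = simulate_linear_sem(w1, 5000, rng)
x2 = simulate_linear_sem(w2, 5000, rng)

edges = stable_changed_edges(
    x1, x2, thresholds=[0.2, 0.3, 0.4], n_subsamples=20, min_freq=0.8, rng=rng
)
print(sorted(edges))
```

With these settings the sketch should recover the three changed edges, (0, 2), (1, 2), and (2, 3), while leaving out the unchanged edge 0 -> 1. On real expression data the causal ordering is unknown and the graphs may be large, which is the setting the DCI package addresses.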