Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, United States.
Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, United States.
Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae463.
Many types of networks, such as co-expression or ChIP-seq-based gene-regulatory networks, provide useful information for biomedical studies. However, they are often too densely connected to interpret, forming "indecipherable hairballs."
To address this issue, we propose that a Bayesian network can summarize the core relationships between gene expression activities. This network, which we call the LatentDAG, is substantially simpler (by two orders of magnitude) than conventional co-expression and ChIP-seq networks. It provides clearer clusters, without extraneous cross-cluster connections, and clear separators between modules. Moreover, a number of clear examples show how it bridges steps in the transcriptional regulatory network and other networks (e.g. RNA-binding protein networks). In conjunction with a graph neural network, the LatentDAG outperforms other biological networks in a variety of tasks, including gene conservation prediction and gene clustering.
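As a rough illustration of why a directed acyclic graph can be far sparser than a co-expression network built from the same data, the following Python sketch simulates expression along a simple chain of genes and compares edge counts. It is not the LatentDAG method: the chain structure, the linear-Gaussian model, the correlation threshold, and the function names `simulate_chain_expression` and `coexpression_edges` are all assumptions made for this example.

```python
import numpy as np


def simulate_chain_expression(n_genes=30, n_samples=500, weight=0.9, seed=0):
    """Simulate expression from a hypothetical chain DAG g0 -> g1 -> ... -> g(n-1).

    Each gene is a linear function of its single parent plus Gaussian noise.
    Returns the (samples x genes) matrix and the list of true DAG edges.
    """
    rng = np.random.default_rng(seed)
    X = np.zeros((n_samples, n_genes))
    X[:, 0] = rng.normal(size=n_samples)
    for j in range(1, n_genes):
        X[:, j] = weight * X[:, j - 1] + rng.normal(scale=0.5, size=n_samples)
    dag_edges = [(j - 1, j) for j in range(1, n_genes)]
    return X, dag_edges


def coexpression_edges(X, threshold=0.3):
    """Build an undirected co-expression network by thresholding |Pearson r|.

    Returns gene pairs (i, j) with i < j whose absolute correlation
    is at least `threshold` (an arbitrary cutoff chosen for illustration).
    """
    corr = np.corrcoef(X, rowvar=False)
    rows, cols = np.triu_indices(corr.shape[0], k=1)
    keep = np.abs(corr[rows, cols]) >= threshold
    return list(zip(rows[keep].tolist(), cols[keep].tolist()))


X, dag_edges = simulate_chain_expression()
coexp = coexpression_edges(X)
print("DAG edges:", len(dag_edges))
print("Co-expression edges:", len(coexp))
```

In this toy setting, correlation propagates along the chain, so genes several steps apart still pass the threshold and the co-expression network accumulates many indirect edges, while the generating DAG keeps only the direct parent-child links.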
Code is available at https://github.com/gersteinlab/LatentDAG.