NG-SEM: an effective non-Gaussian structural equation modeling framework for gene regulatory network inference from single-cell RNA-seq data.

Affiliations

Department of Mathematics, The University of Hong Kong, Pokfulam Road, Hong Kong.

School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, Shaanxi, China.

Publication information

Brief Bioinform. 2023 Sep 22;24(6). doi: 10.1093/bib/bbad369.

Abstract

Inference of gene regulatory networks (GRNs) from gene expression profiles has been a central problem in systems biology and bioinformatics over the past decades. The rapid emergence of single-cell RNA sequencing (scRNA-seq) data brings new opportunities and challenges for GRN inference: the extensive dropouts and complicated noise structure may also degrade the performance of contemporary gene regulatory models. Thus, there is an urgent need to develop more accurate methods for gene regulatory network inference in single-cell data that also account for the noise structure. In this paper, we extend the traditional structural equation modeling (SEM) framework with a flexible noise modeling strategy: we use Gaussian mixtures to approximate the complex stochastic nature of a biological system, since the Gaussian mixture framework can arguably serve as a universal approximator for any continuous distribution. The proposed non-Gaussian SEM framework is called NG-SEM, and it can be optimized by iterating between the Expectation-Maximization algorithm and the weighted least-squares method. Moreover, the Akaike Information Criterion is adopted to select the number of components of the Gaussian mixture. To probe the accuracy and stability of our proposed method, we design a comprehensive variety of controlled experiments to systematically investigate the performance of NG-SEM under various conditions, including simulations and real biological data sets. Results on synthetic data demonstrate that this strategy can improve the performance of the traditional Gaussian SEM model, and results on real biological data sets verify that NG-SEM outperforms five other state-of-the-art methods.
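The alternating scheme described in the abstract can be illustrated with a minimal sketch for a single target gene regressed on candidate regulators. This is not the authors' implementation: the function names `fit_mixture_noise_regression` and `select_by_aic`, the quantile-based initialization, the small variance floor, and the parameter count used in the AIC are all assumptions made for illustration. The sketch alternates an EM-style update of the Gaussian-mixture noise parameters with a weighted least-squares update of the regression coefficients, then picks the number of mixture components by AIC.

```python
import numpy as np


def _log_joint(resid, weights, means, variances):
    """log(pi_k) + log N(r_i | mu_k, sigma_k^2) for every sample/component, shape (n, K)."""
    diff = resid[:, None] - means[None, :]
    return (np.log(weights) - 0.5 * np.log(2.0 * np.pi * variances)
            - 0.5 * diff ** 2 / variances)


def fit_mixture_noise_regression(X, y, n_components, n_iter=200, tol=1e-8):
    """Fit y = X @ coef + e, with e drawn from a K-component Gaussian mixture.

    Illustrative sketch only. X is assumed to have full column rank and no
    intercept column (the mixture means absorb any offset).
    """
    n, d = X.shape
    K = n_components
    # Initialise from ordinary least squares and residual quantiles (assumption).
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef
    weights = np.full(K, 1.0 / K)
    means = np.quantile(resid, (np.arange(K) + 0.5) / K)
    variances = np.full(K, resid.var() + 1e-6)
    prev = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities of each noise component for each residual.
        log_p = _log_joint(resid, weights, means, variances)
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        loglik = log_norm.sum()
        if loglik - prev < tol:
            break
        prev = loglik
        resp = np.exp(log_p - log_norm[:, None])
        # M-step for the mixture parameters, holding the coefficients fixed.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp * resid[:, None]).sum(axis=0) / nk
        variances = (resp * (resid[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
        # Weighted least squares for the coefficients, holding the mixture fixed:
        # minimise sum_i w_i * (y_i - shift_i - x_i @ coef)^2.
        prec = resp / variances              # gamma_ik / sigma_k^2
        w = prec.sum(axis=1)                 # per-sample weight
        shift = (prec @ means) / w           # per-sample noise offset
        Xw = X * w[:, None]
        coef = np.linalg.solve(X.T @ Xw, Xw.T @ (y - shift))
        resid = y - X @ coef
    # Log-likelihood at the final parameters.
    log_p = _log_joint(resid, weights, means, variances)
    loglik = np.logaddexp.reduce(log_p, axis=1).sum()
    # Free parameters: d coefficients, K means, K variances, K - 1 weights.
    n_params = d + 3 * K - 1
    return {"coef": coef, "weights": weights, "means": means,
            "variances": variances, "loglik": loglik,
            "aic": 2.0 * n_params - 2.0 * loglik, "n_components": K}


def select_by_aic(X, y, max_components=3):
    """Fit K = 1..max_components and return the fit with the smallest AIC."""
    fits = [fit_mixture_noise_regression(X, y, k)
            for k in range(1, max_components + 1)]
    return min(fits, key=lambda f: f["aic"])
```

With `n_components=1` the sketch reduces to ordinary Gaussian least squares with an intercept, which is the baseline the abstract refers to as the traditional Gaussian SEM. In a full network setting, one such regression would be fitted per target gene, with the estimated coefficients read as candidate regulatory edge weights; that per-gene decomposition is likewise an assumption of this sketch.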
