Kernelized multiview signed graph learning for single-cell RNA sequencing data.

Affiliations

Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, USA.

Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.

Publication information

BMC Bioinformatics. 2023 Apr 4;24(1):127. doi: 10.1186/s12859-023-05250-y.

Abstract

BACKGROUND

Characterizing the topology of gene regulatory networks (GRNs) is a fundamental problem in systems biology. The advent of single-cell technologies has made it possible to construct GRNs at finer resolutions than bulk and microarray datasets allow. However, the cellular heterogeneity and sparsity of single-cell datasets invalidate the standard Gaussian assumptions used for constructing GRNs. Additionally, most GRN reconstruction approaches estimate a single network for the entire dataset, which can lose information when single-cell datasets are generated from multiple treatment conditions or disease states.

RESULTS

To better characterize single cell GRNs under different but related conditions, we propose the joint estimation of multiple networks using multiple signed graph learning (scMSGL). The proposed method is based on recently developed graph signal processing (GSP) based graph learning, where GRNs and gene expressions are modeled as signed graphs and graph signals, respectively. scMSGL learns multiple GRNs by optimizing the total variation of gene expressions with respect to GRNs while ensuring that the learned GRNs are similar to each other through regularization with respect to a learned signed consensus graph. We further kernelize scMSGL with the kernel selected to suit the structure of single cell data.
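The kind of objective described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the l1 consensus penalty, the weight `gamma`, and the toy data are all assumptions made for illustration. The sketch evaluates a signed total variation (co-expressed genes joined by positive edges lower the cost, as do dissimilar genes joined by negative edges), shows that the same quantity can be computed from a kernel matrix alone, and adds a penalty that pulls each condition's graph toward a consensus graph.

```python
from itertools import combinations

def sq_dist(x, y):
    """Squared Euclidean distance between two expression profiles."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def linear_kernel(X):
    """Gram matrix K[i][j] = <x_i, x_j>; a stand-in for any kernel choice."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def signed_total_variation(X, W):
    """Sum over gene pairs of W[i][j] * ||x_i - x_j||^2.

    X holds one expression profile (row) per gene; W is a symmetric signed
    adjacency matrix. Positive edges penalize differences between genes,
    negative edges reward them.
    """
    n = len(X)
    return sum(W[i][j] * sq_dist(X[i], X[j])
               for i, j in combinations(range(n), 2))

def kernel_total_variation(K, W):
    """Same quantity from a kernel matrix, using
    ||x_i - x_j||^2 = K[i][i] + K[j][j] - 2 * K[i][j] in feature space."""
    n = len(K)
    return sum(W[i][j] * (K[i][i] + K[j][j] - 2 * K[i][j])
               for i, j in combinations(range(n), 2))

def multiview_objective(Xs, Ws, W_consensus, gamma=1.0):
    """Data-fit term per condition plus an l1 distance of each graph to the
    consensus graph (the l1 form and gamma are assumptions of this sketch)."""
    n = len(W_consensus)
    fit = sum(signed_total_variation(X, W) for X, W in zip(Xs, Ws))
    penalty = sum(abs(W[i][j] - W_consensus[i][j])
                  for W in Ws for i, j in combinations(range(n), 2))
    return fit + gamma * penalty

# Toy data: genes 0 and 1 are co-expressed, gene 2 is anti-correlated with gene 0.
X = [[1.0, 2.0], [1.0, 2.0], [-1.0, -2.0]]
W = [[0, 1, -1], [1, 0, 0], [-1, 0, 0]]   # +1 edge (0,1), -1 edge (0,2)
W_other = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]  # second condition lacks the negative edge

tv_signal = signed_total_variation(X, W)
tv_kernel = kernel_total_variation(linear_kernel(X), W)
objective = multiview_objective([X, X], [W, W_other], W, gamma=0.5)
```

With a linear kernel the two total-variation functions agree exactly. Swapping `linear_kernel` for a kernel suited to sparse count data changes only the matrix `K`, which is the practical appeal of kernelizing the objective. An actual learning procedure would optimize over the graphs under constraints rather than merely evaluate the cost for fixed graphs.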

CONCLUSIONS

scMSGL outperforms existing state-of-the-art methods in GRN recovery on simulated datasets. Furthermore, scMSGL successfully identifies well-established regulators in a mouse embryonic stem cell differentiation study and a cancer clinical study of medulloblastoma.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4180/10071725/5772caa5ac44/12859_2023_5250_Fig1_HTML.jpg
