Harmonizing heterogeneous single-cell gene expression data with individual-level covariate information.

Authors

Mu Yudi, Li Wei Vivian

Affiliation

Department of Statistics, University of California, Riverside, Riverside, CA 92521, United States.

Publication

Bioinform Adv. 2025 Aug 9;5(1):vbaf189. doi: 10.1093/bioadv/vbaf189. eCollection 2025.

DOI: 10.1093/bioadv/vbaf189
PMID: 40874236
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12380451/
Abstract

MOTIVATION

The growing availability of single-cell RNA sequencing (scRNA-seq) data highlights the necessity for robust integration methods to uncover both shared and unique cellular features across samples. These datasets often exhibit technical variations and biological differences, complicating integrative analyses. While numerous integration methods have been proposed, many fail to account for individual-level covariates or are limited to discrete variables.

RESULTS

To address these limitations, we propose scINSIGHT2, a generalized linear latent variable model that accommodates both continuous covariates, such as age, and discrete factors, such as disease conditions. Through both simulation studies and real-data applications, we demonstrate that scINSIGHT2 accurately harmonizes scRNA-seq datasets, whether from single or multiple sources. These results highlight scINSIGHT2's utility in capturing meaningful biological insights from scRNA-seq data while accounting for individual-level variation.
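As an illustration of how a generalized linear latent variable model can take both kinds of covariates, the sketch below encodes one individual's age (continuous) and disease condition (discrete) into a single covariate vector and feeds it into a log-link linear predictor alongside latent factors. This is a generic textbook-style form, not the scINSIGHT2 implementation; the function names, the log link, and all numeric values are assumptions made for the example.

```python
import math

def design_row(age, condition, age_mean, age_sd, levels):
    """Encode one individual's covariates (hypothetical helper).

    Returns standardized age followed by dummy variables for the
    condition, using the first entry of `levels` as the reference.
    """
    age_z = (age - age_mean) / age_sd
    dummies = [1.0 if condition == lev else 0.0 for lev in levels[1:]]
    return [age_z] + dummies

def expected_count(intercept, x, beta, z, loadings):
    """Mean expression under an assumed log link.

    eta = intercept + x . beta + z . loadings, mean = exp(eta),
    where x holds observed covariates and z holds latent factors.
    """
    eta = intercept
    eta += sum(xi * bi for xi, bi in zip(x, beta))
    eta += sum(zi * li for zi, li in zip(z, loadings))
    return math.exp(eta)

# Illustrative values only: a 60-year-old individual in the "disease" group.
levels = ["control", "disease"]
x = design_row(age=60.0, condition="disease",
               age_mean=50.0, age_sd=10.0, levels=levels)

# Gene-specific covariate effects (beta) and factor loadings are made up.
mu = expected_count(intercept=0.0, x=x, beta=[0.5, -0.5],
                    z=[1.0, 2.0], loadings=[0.0, 0.0])
print(x, mu)
```

Because continuous and discrete covariates end up in the same vector `x`, the linear predictor treats them uniformly, which is the property the abstract emphasizes over methods limited to discrete variables.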

AVAILABILITY AND IMPLEMENTATION

The scINSIGHT2 method has been implemented as an R package, which is available at https://github.com/yudimu/scINSIGHT2/.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4875/12380451/28257087d0e6/vbaf189f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4875/12380451/4c678a6c1a2c/vbaf189f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4875/12380451/7a68aaa815a8/vbaf189f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4875/12380451/df4ca06d66d8/vbaf189f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4875/12380451/f68f61fef90e/vbaf189f5.jpg