Deciphering cell types by integrating scATAC-seq data with genome sequences.

Affiliations

School of Big Data and Software Engineering, Chongqing University, Chongqing, China.

School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China.

Publication information

Nat Comput Sci. 2024 Apr;4(4):285-298. doi: 10.1038/s43588-024-00622-7. Epub 2024 Apr 10.

DOI:10.1038/s43588-024-00622-7
PMID:38600256
Abstract

The single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) technology provides insight into gene regulation and epigenetic heterogeneity at single-cell resolution, but cell annotation from scATAC-seq remains challenging due to high dimensionality and extreme sparsity within the data. Existing cell annotation methods mostly focus on the cell peak matrix without fully utilizing the underlying genomic sequence. Here we propose a method, SANGO, for accurate single-cell annotation by integrating genome sequences around the accessibility peaks within scATAC data. The genome sequences of peaks are encoded into low-dimensional embeddings, and then iteratively used to reconstruct the peak statistics of cells through a fully connected network. The learned weights are considered as regulatory modes to represent cells, and utilized to align the query cells and the annotated cells in the reference data through a graph transformer network for cell annotations. SANGO was demonstrated to consistently outperform competing methods on 55 paired scATAC-seq datasets across samples, platforms and tissues. SANGO was also shown to be able to detect unknown tumor cells through attention edge weights learned by the graph transformer. Moreover, from the annotated cells, we found cell-type-specific peaks that provide functional insights/biological signals through expression enrichment analysis, cis-regulatory chromatin interaction analysis and motif enrichment analysis.
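The core idea in the abstract (encode each peak's sequence as a low-dimensional embedding, reconstruct per-cell accessibility from those embeddings through a dense layer, and read the learned per-cell weights as cell representations) can be illustrated with a toy sketch. This is not SANGO's implementation: the k-mer frequency encoder stands in for a learned sequence encoder, the single logistic layer stands in for the fully connected network, the graph transformer alignment step is omitted, and every function name, hyperparameter, and the toy data are assumptions made only for illustration.

```python
import math
import random
from itertools import product


def kmer_embedding(seq, k=2):
    """Stand-in sequence encoder: normalized k-mer frequencies of one peak."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {km: 0 for km in kmers}
    for i in range(len(seq) - k + 1):
        km = seq[i:i + k]
        if km in counts:
            counts[km] += 1
    total = max(1, sum(counts.values()))
    return [counts[km] / total for km in kmers]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_cell_weights(peak_emb, accessibility, epochs=500, lr=0.5, seed=0):
    """Fit one weight vector per cell so that sigmoid(w_c . e_p + b_c)
    reconstructs accessibility[p][c] (1 = peak p is open in cell c).
    Returns (W, b); W[c] is the learned representation of cell c."""
    rng = random.Random(seed)
    n_peaks, dim = len(peak_emb), len(peak_emb[0])
    n_cells = len(accessibility[0])
    W = [[rng.uniform(-0.01, 0.01) for _ in range(dim)] for _ in range(n_cells)]
    b = [0.0] * n_cells
    for _ in range(epochs):
        for c in range(n_cells):
            grad_w, grad_b = [0.0] * dim, 0.0
            for p in range(n_peaks):
                z = b[c] + sum(w * e for w, e in zip(W[c], peak_emb[p]))
                err = sigmoid(z) - accessibility[p][c]  # cross-entropy gradient
                grad_b += err
                for j in range(dim):
                    grad_w[j] += err * peak_emb[p][j]
            # Full-batch gradient descent step for this cell.
            b[c] -= lr * grad_b / n_peaks
            for j in range(dim):
                W[c][j] -= lr * grad_w[j] / n_peaks
    return W, b


def cosine(u, v):
    dot = sum(a * c for a, c in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(c * c for c in v))
    return dot / norm


# Toy data (hypothetical): three GC-rich and three AT-rich peaks.
peaks = ["GCGCGCGC", "CGCGGCGC", "GGCCGCGC", "ATATATAT", "TATAATAT", "AATTATAT"]
emb = [kmer_embedding(s) for s in peaks]
# Rows are peaks, columns are cells: cells 0-1 open the GC-rich peaks,
# cells 2-3 open the AT-rich peaks.
acc = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0],
       [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]]
W, b = fit_cell_weights(emb, acc)
same_type = cosine(W[0], W[1])   # cells with the same accessibility pattern
diff_type = cosine(W[0], W[2])   # cells with opposite patterns
```

In this toy setting, cells that share an accessibility pattern end up with nearly parallel weight vectors, while cells with opposite patterns point in opposite directions, which is the property that lets the weights serve as cell representations for a downstream annotation step.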


Similar articles

1
Deciphering cell types by integrating scATAC-seq data with genome sequences.
Nat Comput Sci. 2024 Apr;4(4):285-298. doi: 10.1038/s43588-024-00622-7. Epub 2024 Apr 10.
2
Incorporating network diffusion and peak location information for better single-cell ATAC-seq data analysis.
Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbae093.
3
Application of Single-Cell Assay for Transposase-Accessible Chromatin with High Throughput Sequencing in Plant Science: Advances, Technical Challenges, and Prospects.
Int J Mol Sci. 2024 Jan 25;25(3):1479. doi: 10.3390/ijms25031479.
4
simATAC: a single-cell ATAC-seq simulation framework.
Genome Biol. 2021 Mar 4;22(1):74. doi: 10.1186/s13059-021-02270-w.
5
A Unified Deep Learning Framework for Single-Cell ATAC-Seq Analysis Based on ProdDep Transformer Encoder.
Int J Mol Sci. 2023 Mar 1;24(5):4784. doi: 10.3390/ijms24054784.
6
scBasset: sequence-based modeling of single-cell ATAC-seq using convolutional neural networks.
Nat Methods. 2022 Sep;19(9):1088-1096. doi: 10.1038/s41592-022-01562-8. Epub 2022 Aug 8.
7
CAraCAl: CAMML with the integration of chromatin accessibility.
BMC Bioinformatics. 2024 Jun 13;25(1):212. doi: 10.1186/s12859-024-05833-3.
8
Systematic benchmarking of single-cell ATAC-sequencing protocols.
Nat Biotechnol. 2024 Jun;42(6):916-926. doi: 10.1038/s41587-023-01881-x. Epub 2023 Aug 3.
9
Assessment of computational methods for the analysis of single-cell ATAC-seq data.
Genome Biol. 2019 Nov 18;20(1):241. doi: 10.1186/s13059-019-1854-5.
10
Translator: A Transfer Learning Approach to Facilitate Single-Cell ATAC-Seq Data Analysis from Reference Dataset.
J Comput Biol. 2022 Jul;29(7):619-633. doi: 10.1089/cmb.2021.0596. Epub 2022 May 17.

Cited by

1
Cell-Type Annotation for scATAC-Seq Data by Integrating Chromatin Accessibility and Genome Sequence.
Biomolecules. 2025 Jun 27;15(7):938. doi: 10.3390/biom15070938.
2
Leveraging multiple labeled datasets for the automated annotation of single-cell RNA and ATAC data.
Comput Struct Biotechnol J. 2025 Jul 1;27:2863-2870. doi: 10.1016/j.csbj.2025.06.043. eCollection 2025.
3
MINGLE: a mutual information-based interpretable framework for automatic cell type annotation in single-cell chromatin accessibility data.
Genome Biol. 2025 Jun 11;26(1):162. doi: 10.1186/s13059-025-03603-9.
4
annATAC: automatic cell type annotation for scATAC-seq data based on language model.
BMC Biol. 2025 May 28;23(1):145. doi: 10.1186/s12915-025-02244-5.
5
Graph neural networks for single-cell omics data: a review of approaches and applications.
Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf109.
6
Topological identification and interpretation for single-cell epigenetic regulation elucidation in multi-tasks using scAGDE.
Nat Commun. 2025 Feb 16;16(1):1691. doi: 10.1038/s41467-025-57027-x.
7
Mechanisms and technologies in cancer epigenetics.
Front Oncol. 2025 Jan 7;14:1513654. doi: 10.3389/fonc.2024.1513654. eCollection 2024.
8
MultiKano: an automatic cell type annotation tool for single-cell multi-omics data based on Kolmogorov-Arnold network and data augmentation.
Protein Cell. 2025 May 28;16(5):374-380. doi: 10.1093/procel/pwae069.
9
A variational expectation-maximization framework for balanced multi-scale learning of protein and drug interactions.
Nat Commun. 2024 May 25;15(1):4476. doi: 10.1038/s41467-024-48801-4.