Integrated ambient modeling and genetic demultiplexing of single-cell RNA+ATAC multiome experiments with Ambimux.

Authors

Alvarez Marcus, Li Terence, Lee Seung Hyuk T, Arasu Uma Thanigai, Selvarajan Ilakya, Örd Tiit, Rahmani Elior, Chen Zeyuan Johnson, Avram Oren, Kar Asha, Kaminska Dorota, Männistö Ville, Halperin Eran, Pihlajamäki Jussi, Luo Chongyuan, Kaikkonen Minna U, Zaitlen Noah, Pajukanta Päivi

Affiliations

Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA.

A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland.

Publication information

bioRxiv. 2025 Aug 26:2025.08.21.671671. doi: 10.1101/2025.08.21.671671.

Abstract

Single-cell technologies have advanced rapidly, providing assays for a range of molecular phenotypes. Droplet-based single-cell technologies, particularly those based on nuclei isolation such as the simultaneous RNA+ATAC single-cell multiome, are susceptible to contamination by exogenous ambient molecules, which can increase noise in cell type-level associations. We reasoned that genotype-based sample multiplexing provides an opportunity to infer this ambient contamination by leveraging DNA variation in sequenced reads. We therefore developed ambimux, a likelihood-based method that estimates ambient fractions and demultiplexes single-cell multiome experiments using genotype-level data. Ambimux models the probability that each read is ambient or nuclear, and can therefore classify empty droplets and estimate droplet-specific ambient molecule fractions in each modality. We first evaluated the method on simulated data sets across a range of parameters. Ambimux closely estimated the ground-truth droplet contamination fractions in the RNA (mean absolute error, MAE=0.048) and ATAC (MAE=0.042) modalities. As a result, it maintained high specificity (>95%) and correctly assigned singlets at considerably high ambient fractions (up to 60%) in both the RNA and ATAC modalities. In comparison, models that do not consider ambient contamination maintained similar sensitivity only at considerably lower ambient fractions (up to 25%). We then generated a real data set of seven visceral adipose tissue biopsies run on a single 10x Multiome channel. Ambimux detected 4,986 singlets, a number similar to that captured by other methods. To evaluate the fidelity of the ambient fraction estimates from ambimux, we split singlets into ambient-enriched (>5% contamination in both modalities) and nuclear-enriched (<5% in both) droplets and performed gene-peak linkage analysis. Low-ambient droplets yielded more significant hits, with gene-peak links enriched at the transcription start site relative to high-ambient droplets, suggesting that the ambient-enriched droplets identified by ambimux hamper the identification of biologically meaningful signals. In summary, we developed ambimux, a joint single-cell multiome demultiplexing method that accurately models and estimates ambient molecule contamination in each modality.
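
The read-level idea in the abstract can be illustrated with a toy two-component mixture. The sketch below is not the ambimux implementation; it is a simplified model under stated assumptions: each read covers one biallelic SNP, an ambient read carries the alternate allele at the pooled donor allele frequency, a nuclear read carries it at the assigned donor's dosage divided by two, and a fixed sequencing error rate applies. All function names, the error rate, and the grid search over the ambient fraction are hypothetical choices made for illustration.

```python
import math
import random

ERR = 0.01  # assumed per-read sequencing error rate (illustrative value)


def allele_prob(alt_prob, is_alt, err=ERR):
    """Probability of the observed allele given the source's alt-allele probability."""
    p_alt = alt_prob * (1 - err) + (1 - alt_prob) * err
    return p_alt if is_alt else 1.0 - p_alt


def ambient_alt_freq(genotypes):
    """Assumed ambient pool: mean alt-allele frequency across all donors per SNP."""
    n_donors = len(genotypes)
    n_snps = len(genotypes[0])
    return [sum(g[s] for g in genotypes) / (2.0 * n_donors) for s in range(n_snps)]


def droplet_loglik(reads, dosage, ambient_alt, alpha):
    """Log-likelihood of one droplet's reads; alpha is the ambient fraction."""
    ll = 0.0
    for snp, is_alt in reads:
        p_amb = allele_prob(ambient_alt[snp], is_alt)
        p_nuc = allele_prob(dosage[snp] / 2.0, is_alt)
        ll += math.log(alpha * p_amb + (1.0 - alpha) * p_nuc)
    return ll


def fit_droplet(reads, genotypes, grid_size=51):
    """Grid-search the donor and ambient fraction that maximize the likelihood."""
    ambient_alt = ambient_alt_freq(genotypes)
    best_ll, best_donor, best_alpha = -math.inf, None, None
    for donor, dosage in enumerate(genotypes):
        for i in range(grid_size):
            alpha = i / (grid_size - 1)
            ll = droplet_loglik(reads, dosage, ambient_alt, alpha)
            if ll > best_ll:
                best_ll, best_donor, best_alpha = ll, donor, alpha
    return best_donor, best_alpha


def simulate_reads(genotypes, donor, alpha, n_reads, rng, err=ERR):
    """Draw reads from the same toy model, for checking the estimator."""
    ambient_alt = ambient_alt_freq(genotypes)
    reads = []
    for _ in range(n_reads):
        snp = rng.randrange(len(ambient_alt))
        if rng.random() < alpha:
            alt_prob = ambient_alt[snp]            # read came from the ambient pool
        else:
            alt_prob = genotypes[donor][snp] / 2.0  # read came from the nucleus
        p_alt = alt_prob * (1 - err) + (1 - alt_prob) * err
        reads.append((snp, rng.random() < p_alt))
    return reads


if __name__ == "__main__":
    rng = random.Random(0)
    # Four hypothetical donors with random alt-allele dosages (0, 1, or 2) at 300 SNPs.
    genotypes = [[rng.choice([0, 1, 2]) for _ in range(300)] for _ in range(4)]
    reads = simulate_reads(genotypes, donor=2, alpha=0.3, n_reads=3000, rng=rng)
    donor_hat, alpha_hat = fit_droplet(reads, genotypes)
    print("assigned donor:", donor_hat, "estimated ambient fraction:", alpha_hat)
```

In this toy setting, a droplet whose best-fitting ambient fraction approaches 1 would look like an empty droplet, and the absolute difference between the simulated and estimated fractions is the per-droplet quantity that an MAE averages. A joint RNA+ATAC model would fit one such fraction per modality; that extension is omitted here.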

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71f6/12407767/959f5a7d4761/nihpp-2025.08.21.671671v1-f0001.jpg
