Vaeda computationally annotates doublets in single-cell RNA sequencing data.

Affiliations

Department of Developmental Biology, University of Pittsburgh, Pittsburgh, PA 15201, USA.

Carnegie Mellon-University of Pittsburgh Joint PhD Program, University of Pittsburgh, Pittsburgh, PA 15201, USA.

Publication information

Bioinformatics. 2023 Jan 1;39(1). doi: 10.1093/bioinformatics/btac720.

Abstract

MOTIVATION

Single-cell RNA sequencing (scRNA-seq) continues to expand our knowledge by facilitating the study of transcriptional heterogeneity at the level of single cells. Despite this technology's utility and success in biomedical research, technical artifacts are present in scRNA-seq data. Doublets/multiplets are a type of artifact that occurs when two or more cells are tagged by the same barcode and therefore appear as a single cell. Because this introduces non-existent transcriptional profiles, doublets can bias and mislead downstream analysis. To address this limitation, computational methods to annotate and remove doublets from scRNA-seq datasets are needed.
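To make the idea of a "non-existent transcriptional profile" concrete, the following minimal sketch builds an artificial doublet from two toy cells. It assumes that a doublet can be approximated by summing the per-gene counts of the two cells sharing a barcode; this is a common simplification and is not stated in the abstract. The function name `simulate_doublet`, the gene layout, and the counts are all hypothetical.

```python
def simulate_doublet(cell_a, cell_b):
    """Sum per-gene counts of two cells, mimicking one barcode tagging both.

    Assumption: a doublet profile is roughly the sum of its two source cells.
    """
    if len(cell_a) != len(cell_b):
        raise ValueError("both cells must cover the same genes")
    return [a + b for a, b in zip(cell_a, cell_b)]


# Toy counts over four hypothetical genes (illustrative, not real data).
cell_type_1 = [9, 0, 1, 0]  # mostly expresses gene 0
cell_type_2 = [0, 8, 0, 2]  # mostly expresses gene 1

doublet = simulate_doublet(cell_type_1, cell_type_2)
print(doublet)  # [9, 8, 1, 2]
```

The resulting profile expresses both marker genes strongly, a combination that neither toy cell type shows on its own, which is why such barcodes can be mistaken for a distinct cell population downstream.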

RESULTS

We introduce vaeda (Variational Auto-Encoder for Doublet Annotation), a new approach for computational annotation of doublets in scRNA-seq data. Vaeda integrates a variational auto-encoder and Positive-Unlabeled learning to produce doublet scores and binary doublet calls. We apply vaeda, along with seven existing doublet annotation methods, to 16 benchmark datasets and find that vaeda performs competitively in terms of doublet scores and doublet calls. Notably, vaeda outperforms other Python-based methods for doublet annotation. Altogether, vaeda is a robust and competitive method for scRNA-seq doublet annotation and may be of particular interest in the context of Python-based workflows.
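The positive-unlabeled setting and the step from doublet scores to binary calls can be illustrated with a toy sketch. This is not vaeda's implementation: it swaps the variational auto-encoder and classifier for a plain nearest-neighbour vote on two-dimensional toy coordinates. It also assumes that the positive examples are simulated doublets and that the observed cells are the unlabeled set, which the abstract does not spell out. The names `knn_doublet_scores` and `jitter`, the value `k=5`, and the 0.5 threshold are hypothetical choices.

```python
import math
import random


def knn_doublet_scores(observed, simulated, k=5):
    """Score each observed (unlabeled) cell by the fraction of its k nearest
    neighbours, among all other cells, that are simulated doublets (positives)."""
    pool = [(c, 0) for c in observed] + [(c, 1) for c in simulated]
    scores = []
    for i, cell in enumerate(observed):
        # Distances to every other cell in the pool, skipping the cell itself.
        dists = [(math.dist(cell, other), label)
                 for j, (other, label) in enumerate(pool) if j != i]
        dists.sort(key=lambda pair: pair[0])
        scores.append(sum(label for _, label in dists[:k]) / k)
    return scores


rng = random.Random(0)


def jitter(center):
    """Return a toy cell near `center` with bounded uniform noise."""
    return [c + rng.uniform(-1, 1) for c in center]


# Two toy singlet populations plus one unlabeled doublet hidden among them.
type_a = [jitter([10, 0]) for _ in range(10)]
type_b = [jitter([0, 10]) for _ in range(10)]
hidden_doublet = [10.0, 10.0]
observed = type_a + type_b + [hidden_doublet]

# Positives: simulated doublets formed by summing one cell of each type.
simulated = [[x + y for x, y in zip(a, b)] for a, b in zip(type_a, type_b)]

scores = knn_doublet_scores(observed, simulated)
calls = [s > 0.5 for s in scores]  # arbitrary threshold, for illustration only
print(scores[-1], calls.count(True))
```

In this toy setup the hidden doublet sits among the simulated doublets and receives the maximum score, while the singlets are surrounded by their own population and score zero. Real data are far noisier, which is part of the motivation for learning a representation first rather than voting on raw coordinates.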

AVAILABILITY AND IMPLEMENTATION

Vaeda is available at https://github.com/kostkalab/vaeda, and the version used for the results we present here is archived at Zenodo (https://doi.org/10.5281/zenodo.7199783).

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e16/9805559/bd9488f97a89/btac720f1.jpg
