I-Impute: a self-consistent method to impute single cell RNA sequencing data.

Affiliations

School of Software, Northwestern Polytechnical University, Xi'an, Shaanxi, 710072, China.

Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, China.

Publication information

BMC Genomics. 2020 Nov 18;21(Suppl 10):618. doi: 10.1186/s12864-020-07007-w.

Abstract

BACKGROUND

Single-cell RNA-sequencing (scRNA-seq) is becoming indispensable in the study of cell-specific transcriptomes. However, in scRNA-seq techniques, only a small fraction of the genes are captured due to "dropout" events. These dropout events require intensive treatment when analyzing scRNA-seq data. For example, imputation tools have been proposed to estimate dropout events and de-noise data. The performance of these imputation tools is often evaluated, or fine-tuned, using various clustering criteria based on ground-truth cell subgroup labels. This limits their effectiveness in cases where we lack cell subgroup knowledge. We consider an alternative strategy which requires the imputation to follow a "self-consistency" principle; that is, the imputation process refines its results until no internal inconsistencies or dropouts remain in the data.

RESULTS

We propose the use of "self-consistency" as the main criterion in performing imputation. To demonstrate this principle, we devised I-Impute, a "self-consistent" method to impute scRNA-seq data. I-Impute optimizes continuous similarities and dropout probabilities in iterative refinements until a self-consistent imputation is reached. On the in silico data sets, I-Impute consistently exhibited the highest Pearson correlations across different dropout rates compared with the state-of-the-art methods SAVER and scImpute. Furthermore, we collected three wet-lab datasets (a mouse bladder cell dataset, an embryonic stem cell dataset, and an aortic leukocyte cell dataset) to evaluate the tools. I-Impute exhibited feasible cell subpopulation discovery efficacy on all three datasets and achieved the highest clustering accuracy compared with SAVER and scImpute.

CONCLUSIONS

A strategy based on "self-consistency", captured through our method I-Impute, gave better imputation results than the state-of-the-art tools. Source code of I-Impute can be accessed at https://github.com/xikanfeng2/I-Impute
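
To make the "refine until self-consistent" idea concrete, the following is a minimal toy sketch in Python. It is not the published I-Impute algorithm: the function name `self_consistent_impute`, the Gaussian similarity kernel, the rule that every zero is a candidate dropout, and the stopping tolerance are all assumptions chosen for illustration. It only shows the general pattern of recomputing cell-cell similarities from the current estimate and re-imputing until the result stops changing.

```python
import numpy as np


def self_consistent_impute(X, max_iter=50, tol=1e-4):
    """Toy fixed-point imputation (illustrative only, NOT the I-Impute method).

    X is a cells x genes expression matrix with at least two cells.
    Assumption: every zero entry is treated as a candidate dropout.
    Returns the imputed matrix and the number of iterations performed.
    """
    X = np.asarray(X, dtype=float)
    if X.shape[0] < 2:
        raise ValueError("need at least two cells")
    dropout_mask = X == 0
    imputed = X.copy()
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        # Pairwise squared distances between cells, from the current estimate.
        # (Broadcasting is fine at toy scale; it is memory-hungry for real data.)
        diff = imputed[:, None, :] - imputed[None, :, :]
        sq_dist = (diff ** 2).sum(axis=-1)

        # Continuous similarities: Gaussian kernel scaled by the median distance.
        positive = sq_dist[sq_dist > 0]
        scale = np.median(positive) if positive.size else 1.0
        weights = np.exp(-sq_dist / scale)
        np.fill_diagonal(weights, 0.0)  # a cell does not impute itself
        row_sums = weights.sum(axis=1, keepdims=True)
        row_sums[row_sums == 0] = 1.0   # guard against numerical underflow
        weights /= row_sums

        # Re-impute only the candidate dropouts; observed values stay fixed.
        neighbor_avg = weights @ imputed
        updated = np.where(dropout_mask, neighbor_avg, X)

        change = np.abs(updated - imputed).max()
        imputed = updated
        if change < tol:  # the estimate reproduces itself: stop refining
            break
    return imputed, n_iter


if __name__ == "__main__":
    counts = np.array([
        [5.0, 0.0, 3.0],
        [4.0, 2.0, 3.0],
        [0.0, 0.0, 9.0],
        [1.0, 0.0, 8.0],
    ])
    result, iterations = self_consistent_impute(counts)
    print(iterations)
    print(np.round(result, 2))
```

In this sketch the loop may also stop at `max_iter` without reaching the tolerance, since convergence is not guaranteed for an arbitrary kernel; the authors' actual procedure is available in the repository linked above.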
