Single cell RNA-seq data clustering using TF-IDF based methods.

Author information

University of Connecticut, Storrs, 06269, CT, USA.

Publication information

BMC Genomics. 2018 Aug 13;19(Suppl 6):569. doi: 10.1186/s12864-018-4922-4.

DOI:10.1186/s12864-018-4922-4

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6101073/

Abstract

BACKGROUND

Single cell transcriptomics is critical for understanding cellular heterogeneity and identification of novel cell types. Leveraging the recent advances in single cell RNA sequencing (scRNA-Seq) technology requires novel unsupervised clustering algorithms that are robust to high levels of technical and biological noise and scale to datasets of millions of cells.

RESULTS

We present novel computational approaches for clustering scRNA-seq data based on the Term Frequency - Inverse Document Frequency (TF-IDF) transformation that has been successfully used in the field of text analysis.
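To make the idea concrete, here is a minimal sketch of how a TF-IDF transformation could be applied to a gene-by-cell count matrix, treating each cell as a "document" and each gene as a "term". The abstract does not give the exact formula, so the definitions below (term frequency as a gene's share of the cell's total counts, inverse document frequency as the natural log of the number of cells divided by the number of cells expressing the gene) are a common text-mining variant chosen as an assumption, not necessarily the one used in the paper. The function name `tfidf_transform` and the toy matrix are hypothetical.

```python
import numpy as np


def tfidf_transform(counts):
    """Return TF-IDF scores for a genes x cells matrix of non-negative counts.

    Assumed definitions (a standard variant, not confirmed by the abstract):
      TF[g, c] = counts[g, c] / total counts in cell c
      IDF[g]   = ln(number of cells / number of cells expressing gene g)
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[1]

    # Term frequency: normalize each cell (column) by its total count.
    totals = counts.sum(axis=0, keepdims=True)
    tf = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

    # Document frequency: number of cells in which each gene is detected.
    df = (counts > 0).sum(axis=1, keepdims=True)

    # Inverse document frequency; genes detected nowhere keep TF = 0 anyway,
    # so clamping df to 1 only avoids a division by zero.
    idf = np.log(n_cells / np.maximum(df, 1))

    return tf * idf


# Toy example: 3 genes (rows) x 3 cells (columns).
counts = np.array([
    [2, 0, 4],  # gene detected in two cells
    [1, 1, 1],  # gene detected in every cell
    [0, 3, 0],  # gene detected in a single cell
])
scores = tfidf_transform(counts)
```

Under these assumed definitions, a gene detected in every cell receives a score of zero everywhere, while a gene detected in few cells is up-weighted. The resulting scores could then be passed to any standard clustering routine or used to select informative genes.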

CONCLUSIONS

Empirical experimental results show that TF-IDF methods consistently outperform commonly used scRNA-Seq clustering approaches.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0c9/6101073/ed1938844fc4/12864_2018_4922_Fig1_HTML.jpg
