Effective single-cell clustering through ensemble feature selection and similarity measurements.

Authors

Jeong Hyundoo, Khunlertgit Navadon

Affiliations

Department of Mechatronics Engineering, Incheon National University, Incheon 22012, Republic of Korea.

Optimization Theory and Applications for Engineering Systems Research Group, Department of Computer Engineering, Faculty of Engineering, Chiang Mai University, Chiang Mai 50200, Thailand; Biomedical Engineering Institute, Chiang Mai University, Chiang Mai 50200, Thailand.

Publication information

Comput Biol Chem. 2020 May 19;87:107283. doi: 10.1016/j.compbiolchem.2020.107283.

Abstract

Single-cell RNA sequencing technologies have revolutionized biomedical research by providing an effective means to profile gene expression in individual cells. One of the first fundamental steps in an in-depth analysis of single-cell sequencing data is cell type classification and identification. Computational methods such as clustering algorithms have been widely used and are gaining popularity because they can save considerable resources and time on experimental validation. Although selecting the optimal features (i.e., genes) is essential for obtaining accurate and reliable single-cell clustering results, the computational complexity of the selection and dropout events, which introduce zero-inflated noise, make this process very challenging. In this paper, we propose an effective single-cell clustering algorithm based on ensemble feature selection and similarity measurements. We first identify a set of potential features, then measure cell-to-cell similarity on subsets of these potential features drawn through multiple feature sampling approaches. We construct an ensemble network from the cell-to-cell similarities. Finally, we apply a network-based clustering algorithm to obtain single-cell clusters. We evaluate the proposed algorithm through multiple assessments on real-world single-cell RNA sequencing datasets with known cell types. The results show that the proposed algorithm identifies accurate and consistent single-cell clusters. Moreover, the proposed algorithm takes relative expression as input, so it can easily be adopted by existing analysis pipelines. The source code is publicly available at https://github.com/jeonglab/scCLUE.
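The four steps described in the abstract (candidate feature selection, repeated feature sampling, an ensemble similarity network, and network-based clustering) can be illustrated with a minimal sketch. This is not the scCLUE implementation: the function names, the variance-based candidate selection, the distance-based similarity, and the thresholded connected-components clustering are all assumptions chosen to keep the example short and dependency-free. The toy expression matrix is invented for illustration only.

```python
import math
import random


def candidate_genes(expr, n_keep):
    """Step 1 (assumed criterion): keep the n_keep genes with the highest variance."""
    n_cells = len(expr)

    def variance(g):
        col = [cell[g] for cell in expr]
        mean = sum(col) / n_cells
        return sum((v - mean) ** 2 for v in col) / n_cells

    return sorted(range(len(expr[0])), key=variance, reverse=True)[:n_keep]


def similarity(a, b):
    """Assumed similarity: 1 / (1 + Euclidean distance), so it lies in (0, 1]."""
    return 1.0 / (1.0 + math.dist(a, b))


def ensemble_similarity(expr, genes, n_rounds, subset_size, rng):
    """Steps 2-3: average cell-to-cell similarity over random gene subsets."""
    n = len(expr)
    sim = [[0.0] * n for _ in range(n)]
    for _ in range(n_rounds):
        subset = rng.sample(genes, subset_size)
        profiles = [[cell[g] for g in subset] for cell in expr]
        for i in range(n):
            for j in range(i + 1, n):
                s = similarity(profiles[i], profiles[j]) / n_rounds
                sim[i][j] += s
                sim[j][i] += s
    return sim


def cluster_network(sim, threshold):
    """Step 4 (assumed method): link cells whose averaged similarity exceeds
    the threshold and report connected components as clusters."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and sim[u][v] > threshold:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels


# Toy data (hypothetical): rows are cells, columns are genes.
# Cells 0-2 express genes 0-1, cells 3-5 express genes 2-3,
# and genes 4-5 are nearly constant across all cells.
expr = [
    [5.0, 4.0, 0.0, 0.0, 2.0, 2.0],
    [4.0, 5.0, 0.0, 1.0, 2.0, 2.0],
    [5.0, 5.0, 1.0, 0.0, 2.0, 3.0],
    [0.0, 0.0, 5.0, 4.0, 2.0, 2.0],
    [1.0, 0.0, 4.0, 5.0, 3.0, 2.0],
    [0.0, 1.0, 5.0, 5.0, 2.0, 2.0],
]

genes = candidate_genes(expr, n_keep=4)
sim = ensemble_similarity(expr, genes, n_rounds=20, subset_size=2,
                          rng=random.Random(0))
labels = cluster_network(sim, threshold=0.3)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Averaging over many gene subsets means no single sampled subset decides whether two cells are linked, which is the intuition behind using an ensemble. A practical pipeline would also need to handle the dropout noise mentioned above, which this sketch ignores.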

