Applications of Community Detection Algorithms to Large Biological Datasets.

Affiliation

BIU, Department of Bioengineering, Bar-Ilan University, Ramat Gan, Israel.

Publication information

Methods Mol Biol. 2021;2243:59-80. doi: 10.1007/978-1-0716-1103-6_3.

Abstract

Recent advances in data acquisition technologies in biology have led to major challenges in mining relevant information from large datasets. For example, single-cell RNA sequencing technologies produce expression and sequence information from tens of thousands of cells in a single experiment. A common task in analyzing biological data is to cluster samples or features (e.g., genes) into groups that share common characteristics. This is an NP-hard problem for which numerous heuristic algorithms have been developed. However, in many cases, the clusters created by these algorithms do not reflect biological reality. To overcome this, a Networks Based Clustering (NBC) approach was recently proposed, in which the samples or genes in the dataset are first mapped to a network and community detection (CD) algorithms are then used to identify clusters of nodes. Here, we created an open and flexible Python-based toolkit for NBC that enables easy and accessible network construction and community detection. We then tested the applicability of NBC for identifying clusters of cells or genes from previously published large-scale single-cell and bulk RNA-seq datasets. We show that NBC can be used to accurately and efficiently analyze large-scale datasets from RNA sequencing experiments.
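The two NBC steps described in the abstract (map samples to a network, then run community detection on it) can be illustrated with a minimal sketch. This is not the authors' toolkit: the function names `knn_graph` and `network_based_clusters`, the choice of a k-nearest-neighbor graph with Euclidean distance, the use of NetworkX's greedy modularity algorithm as the CD step, and the toy data are all assumptions made for illustration.

```python
import math
import random

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities


def knn_graph(samples, k):
    """Build an undirected k-nearest-neighbor graph over the rows of `samples`.

    Hypothetical helper: each sample (e.g., a cell's expression vector)
    becomes a node and is linked to its k closest samples.
    """
    graph = nx.Graph()
    graph.add_nodes_from(range(len(samples)))
    for i, row in enumerate(samples):
        # Rank every other sample by Euclidean distance to sample i.
        neighbors = sorted(
            (j for j in range(len(samples)) if j != i),
            key=lambda j: math.dist(row, samples[j]),
        )
        for j in neighbors[:k]:
            graph.add_edge(i, j)
    return graph


def network_based_clusters(samples, k=3):
    """Step 1: map samples to a network. Step 2: detect communities."""
    graph = knn_graph(samples, k)
    # Greedy modularity maximization stands in for "a CD algorithm" here;
    # any other community detection method could be swapped in.
    communities = greedy_modularity_communities(graph)
    return [sorted(members) for members in communities]


# Toy data (made up): two well-separated groups of 10 "cells" x 5 "genes".
rng = random.Random(0)
n_per_group, n_genes = 10, 5
group_a = [[rng.uniform(-1, 1) for _ in range(n_genes)]
           for _ in range(n_per_group)]
group_b = [[100 + rng.uniform(-1, 1) for _ in range(n_genes)]
           for _ in range(n_per_group)]
samples = group_a + group_b

clusters = network_based_clusters(samples, k=3)
for idx, members in enumerate(clusters):
    print(f"cluster {idx}: {members}")
```

In this toy setup the two groups are far enough apart that no nearest-neighbor edge crosses between them, so no detected community mixes samples from both groups. The algorithm may still split a single group into more than one community. On real expression data, the choice of distance measure, `k`, and CD algorithm would all affect the result.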
