Charles Perkins Centre, The University of Sydney, Sydney, NSW 2006, Australia.
School of Mathematics and Statistics, The University of Sydney, Sydney, NSW 2006, Australia.
Bioinformatics. 2022 Oct 14;38(20):4745-4753. doi: 10.1093/bioinformatics/btac590.
With the recent surge of large-cohort-scale single-cell research, it is critically important that analytical methods fully utilize the comprehensive characterization of cellular systems that single-cell technologies produce, so that they can provide insights into samples from individuals. Currently, there is little consensus on the best way to compress information from the complex data structures of these technologies into summary statistics that represent each sample (e.g. each individual).
Here, we present scFeatures, an approach that creates interpretable cellular and molecular representations of single-cell and spatial data at the sample level. We demonstrate that summarizing a broad collection of features at the sample level is important both for understanding underlying disease mechanisms in different experimental studies and for accurately classifying the disease status of individuals.
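To make the idea of a sample-level representation concrete, the following is a minimal sketch in Python of one plausible summary: the proportion of each cell type within each sample. It is not the scFeatures API (scFeatures is an R package), and the function name `cell_type_proportions`, the input layout, and the toy data are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def cell_type_proportions(cells):
    """Summarize cell-level labels into one feature vector per sample.

    `cells` is an iterable of (sample_id, cell_type) pairs, one per cell.
    This layout is a hypothetical stand-in for real single-cell metadata.
    """
    # Count how many cells of each type were observed in each sample.
    counts = defaultdict(Counter)
    for sample_id, cell_type in cells:
        counts[sample_id][cell_type] += 1

    # Use the same set of cell types for every sample so that the
    # resulting feature vectors are directly comparable across samples.
    all_types = sorted({t for c in counts.values() for t in c})

    features = {}
    for sample_id, c in counts.items():
        total = sum(c.values())
        # A Counter returns 0 for cell types absent from this sample.
        features[sample_id] = {t: c[t] / total for t in all_types}
    return features


# Toy data (illustrative only): six cells from two individuals.
cells = [
    ("patient_A", "T cell"),
    ("patient_A", "T cell"),
    ("patient_A", "B cell"),
    ("patient_A", "monocyte"),
    ("patient_B", "B cell"),
    ("patient_B", "B cell"),
]
features = cell_type_proportions(cells)
```

Each sample is reduced to a fixed-length vector indexed by cell type, which could then be passed to a downstream classifier or compared between disease groups.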
scFeatures is publicly available as an R package at https://github.com/SydneyBioX/scFeatures. All data used in this study are publicly available, with accession IDs reported in Section 2.
Supplementary data are available at Bioinformatics online.