Xie Aimin, Wang Hao, Zhao Jiaxu, Wang Zhaoyang, Xu Jinyuan, Xu Yan
College of Bioinformatics Science and Technology, Harbin Medical University, 157 Baojian Road, Harbin, Heilongjiang 150081, China.
Genetron Health (Beijing) Co. Ltd, 1-2/F, Building 11, Zone 1, 8 Life Science Parkway, Changping District, Beijing 102208, China.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae655.
Despite significant advancements in single-cell sequencing analysis for characterizing tissue sample heterogeneity, identifying the associations between cell subpopulations and disease phenotypes remains a challenging task. Here, we introduce scPAS, a new bioinformatics tool designed to integrate bulk data to identify phenotype-associated cell subpopulations within single-cell data. scPAS employs a network-regularized sparse regression model to quantify the association between each cell in single-cell data and a phenotype. Additionally, it estimates the significance of these associations through a permutation test, thereby identifying phenotype-associated cell subpopulations. Utilizing simulated data and various single-cell datasets from breast carcinoma, ovarian cancer, and atherosclerosis, as well as spatial transcriptomics data from multiple cancers, we demonstrated the accuracy, flexibility, and broad applicability of scPAS. Evaluations on large datasets revealed that scPAS exhibits superior operational efficiency compared to other methods. The open-source scPAS R package is available on GitHub at https://github.com/aiminXie/scPAS.
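The two steps named in the abstract, a network-regularized sparse regression followed by a permutation test, can be illustrated with a small Python sketch on synthetic data. This is not the scPAS implementation (scPAS is an R package) and makes several assumptions not stated in the abstract: the penalty is taken to be an L1 term plus a graph-Laplacian quadratic term, the model is fitted by proximal gradient descent, the gene network is a toy chain graph, and the null distribution is built by shuffling the fitted coefficients across genes. The function names `fit_network_sparse_regression` and `permutation_pvalues` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fit_network_sparse_regression(X, y, L, lam=0.05, alpha=0.5, n_iter=500):
    """Assumed objective (not taken from the paper):
    (1/2n)||y - Xw||^2 + lam*alpha*||w||_1 + lam*(1-alpha)/2 * w'Lw,
    minimized by proximal gradient descent (ISTA)."""
    n, p = X.shape
    # Hessian of the smooth part; its largest eigenvalue bounds the step size.
    H = X.T @ X / n + lam * (1 - alpha) * L
    step = 1.0 / np.linalg.eigvalsh(H)[-1]
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n + lam * (1 - alpha) * (L @ w)
        w = soft_threshold(w - step * grad, step * lam * alpha)
    return w

def permutation_pvalues(cells, w, n_perm=1000, seed=0):
    """Score each cell as expression . w, then compare against scores
    obtained with coefficients shuffled across genes (assumed null)."""
    rng = np.random.default_rng(seed)
    observed = cells @ w
    W_null = np.stack([rng.permutation(w) for _ in range(n_perm)])
    null = W_null @ cells.T                      # shape (n_perm, n_cells)
    mu = null.mean(axis=0)
    exceed = (np.abs(null - mu) >= np.abs(observed - mu)).sum(axis=0)
    return observed, (exceed + 1) / (n_perm + 1)  # two-sided empirical p

# Synthetic demonstration: 60 bulk samples, 30 genes, 40 single cells.
rng = np.random.default_rng(42)
n_bulk, n_genes, n_cells = 60, 30, 40
w_true = np.zeros(n_genes)
w_true[:5] = 1.0                                 # five phenotype-linked genes
X_bulk = rng.normal(size=(n_bulk, n_genes))
y_bulk = X_bulk @ w_true + 0.1 * rng.normal(size=n_bulk)

# Toy gene network: a chain graph and its Laplacian L = D - A.
A = np.eye(n_genes, k=1) + np.eye(n_genes, k=-1)
L = np.diag(A.sum(axis=1)) - A

w_hat = fit_network_sparse_regression(X_bulk, y_bulk, L)

# The first 20 cells over-express the phenotype-linked genes.
cells = rng.normal(size=(n_cells, n_genes))
cells[:20, :5] += 2.0
scores, pvals = permutation_pvalues(cells, w_hat)
print("mean score, shifted cells:", scores[:20].mean())
print("mean score, other cells:  ", scores[20:].mean())
```

In this toy setting, cells with a large positive score and a small empirical p-value would be the ones flagged as phenotype-associated. How scPAS actually constructs the gene network and the permutation background is described in the paper and the package repository.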