Liu Guojun, Shi Yan, Huang Hongxu, Xiao Ningkun, Liu Chuncheng, Zhao Hongyu, Xing Yongqiang, Cai Lu
School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou 014000, China.
Inner Mongolia Key Laboratory of Life Health and Bioinformatics, Inner Mongolia University of Science and Technology, Baotou 014000, China.
Biology (Basel). 2025 Apr 26;14(5):479. doi: 10.3390/biology14050479.
The development of single-cell RNA sequencing (scRNA-seq) has significantly improved cellular resolution. However, accurate cell-type annotation remains a major challenge. Existing annotation tools are often limited by their reliance on reference datasets, by the heterogeneity of marker genes, and by subjective biases introduced through manual intervention, all of which reduce annotation accuracy and reliability. To address these limitations, we developed FPCAM, a fully automated cell-type annotation model for pulmonary fibrosis. Built on the R Shiny platform, FPCAM takes a matrix of up-regulated marker genes and a manually curated gene-cell association dictionary specific to pulmonary fibrosis as inputs. It annotates cell types accurately and efficiently by constructing a similarity matrix and applying optimized matching algorithms. To evaluate its performance, we compared FPCAM with state-of-the-art annotation models, including SCSA, SingleR, and SciBet. FPCAM and SCSA both achieved an accuracy of 89.7%, outperforming SingleR and SciBet. Furthermore, FPCAM demonstrated high accuracy in annotating the external validation dataset GSE135893, successfully identifying multiple cell subtypes. In summary, FPCAM provides an efficient, flexible, and accurate solution for cell-type identification and serves as a powerful tool for scRNA-seq research in pulmonary fibrosis and other related diseases.
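The abstract names the two inputs (up-regulated marker genes per cluster and a gene-cell association dictionary) and the two steps (similarity matrix construction, then matching), but it does not specify the similarity measure or the matching rule. The following is therefore only a minimal sketch of the general idea, not FPCAM's implementation. It assumes Jaccard overlap as the similarity and a best-score rule with a cutoff as the matching step. The function names, the `min_score` parameter, and the small gene lists are hypothetical and chosen for illustration only.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap between two gene sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(
    cluster_markers: Dict[str, Set[str]],
    cell_dict: Dict[str, Set[str]],
) -> Dict[str, Dict[str, float]]:
    """Score every cluster against every cell type in the dictionary."""
    return {
        cluster: {cell_type: jaccard(markers, genes)
                  for cell_type, genes in cell_dict.items()}
        for cluster, markers in cluster_markers.items()
    }


def annotate(
    cluster_markers: Dict[str, Set[str]],
    cell_dict: Dict[str, Set[str]],
    min_score: float = 0.1,  # hypothetical cutoff, not taken from the paper
) -> Dict[str, str]:
    """Assign each cluster the best-scoring cell type, or "Unknown"."""
    labels = {}
    for cluster, scores in similarity_matrix(cluster_markers, cell_dict).items():
        best_type = max(scores, key=scores.get)
        labels[cluster] = best_type if scores[best_type] >= min_score else "Unknown"
    return labels


# Illustrative gene-cell dictionary (a toy subset, not FPCAM's curated one).
dictionary = {
    "AT2": {"SFTPC", "SFTPB", "NAPSA"},
    "Fibroblast": {"COL1A1", "COL1A2", "DCN", "LUM"},
    "Macrophage": {"CD68", "MARCO", "C1QA"},
}

# Illustrative up-regulated marker genes per cluster.
clusters = {
    "0": {"SFTPC", "SFTPB", "LAMP3"},
    "1": {"COL1A1", "DCN", "LUM", "POSTN"},
    "2": {"GAPDH", "ACTB"},
}

labels = annotate(clusters, dictionary)
print(labels)
```

In this sketch a cluster whose markers overlap no dictionary entry is labeled "Unknown" rather than forced into the nearest cell type. That is a design choice of the example, not a documented behavior of FPCAM.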