Liu Xinyu, Zhang Zhen, Tan Chao, Ai Yinquan, Liu Hao, Li Yuan, Yang Jin, Song Yongyan
Clinical Medical College & Affiliated Hospital & College of Basic Medicine, Chengdu University, Chengdu, 610081, China.
Hereditas. 2025 Aug 16;162(1):164. doi: 10.1186/s41065-025-00528-y.
BACKGROUND: Single-cell RNA sequencing (scRNA-seq) has revolutionized the analysis of cellular heterogeneity by decoding gene expression profiles at the level of individual cells, while machine learning (ML) has emerged as a core computational tool for clustering, dimensionality reduction, and developmental trajectory inference in single-cell transcriptomics (SCT). Although 3,307 papers have been published over the past two decades, no bibliometric review has comprehensively addressed the field's methodological evolution, technical challenges, and clinical translation pathways. This study aims to fill that gap through bibliometric and visual analysis, revealing trends in technological evolution and directions for future development.
METHODS: Using 3,307 publications from the Web of Science Core Collection (WOSCC), we conducted bibliometric and visualization analyses with CiteSpace and VOSviewer to systematically review research trends, national and institutional contributions, keyword co-occurrence networks, and co-citation relationships. Data screening was strictly limited to English-language articles and reviews, excluding irrelevant document types, and focused on the core application scenarios of ML in SCT.
RESULTS: China and the United States dominated research output (65% combined), with China leading in publication volume (54.8%) and the US demonstrating academic influence through an H-index of 84 and 37,135 total citations. Research hotspots concentrated on random forest (RF) and deep learning models, showing a transition from algorithm development to clinical applications (e.g., tumor immune microenvironment analysis). The Chinese Academy of Sciences and Harvard University emerged as core collaboration hubs, and the international cooperation network primarily featured US-China collaboration. Keyword clustering revealed four themes: gene expression, immunotherapy, bioinformatics, and inflammation-related research. Technical bottlenecks included data heterogeneity, insufficient model interpretability, and weak cross-dataset generalization.
CONCLUSION: The integration of ML with scRNA-seq has advanced cellular heterogeneity analysis and the development of precision medicine. Future work should optimize deep learning architectures, improve model generalization, and promote technical translation by integrating multi-omics and clinical data. Interdisciplinary collaboration is key to overcoming current limitations (e.g., data standardization, algorithm interpretability) and ultimately to achieving deep integration between single-cell technologies and precision medicine.
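The RESULTS cite an H-index as the measure of national academic influence without defining it. As a minimal sketch, assuming the standard definition (the largest h such that h papers each have at least h citations), the metric could be computed as follows. The function name `h_index` and the citation counts are hypothetical and purely illustrative; they are not data from the study, and this is not necessarily how Web of Science or the authors derived the reported value.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` papers all have >= rank citations
        else:
            break      # counts only decrease from here, so h cannot grow
    return h


# Hypothetical citation counts for five papers (illustrative only).
example_counts = [10, 8, 5, 4, 3]
print(h_index(example_counts))  # 4
```

Under this definition, a single very highly cited paper raises total citations but adds at most one to the H-index, which is why the abstract reports the two figures separately.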