Gene print-based cell subtypes annotation of human disease across heterogeneous datasets with gPRINT.

Authors

Yan Ruojin, Fan Chunmei, Gu Shen, Wang Tingzhang, Yin Zi, Chen Xiao

Affiliations

Department of Orthopedic Surgery of Sir Run Run Shaw Hospital, and Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou 310011, China.

Key Laboratory of Novel Targets and Drug Study for Neural Repair of Zhejiang Province, Department of Clinical Medicine, School of Medicine, Hangzhou City University, Hangzhou 310015, China.

Publication information

Protein Cell. 2025 Aug 7;16(8):685-704. doi: 10.1093/procel/pwaf001.

Abstract

Identification of disease-specific cell subtypes (DSCSs) has profound implications for understanding disease mechanisms, preoperative diagnosis, and precision therapy. However, achieving unified annotation of DSCSs across heterogeneous single-cell datasets remains a challenge. In this study, we developed the gPRINT algorithm (generalized approach for cell subtype identification with single cell's voicePRINT). Inspired by the principles of speech recognition in noisy environments, gPRINT transforms gene position and gene expression information into voiceprints based on ordered and clustered gene expression phenomena, obtaining a unique "gene print" pattern for each cell. We then integrated neural networks to mitigate the impact of background noise on cell identity label mapping. We demonstrated the reproducibility of gPRINT across different donors, single-cell sequencing platforms, and disease subtypes, and its utility for automatic cell subtype annotation across datasets. Moreover, gPRINT achieved an annotation accuracy of 98.37% in external validation on the same tissue, surpassing other algorithms. We further applied the approach to fibrosis-associated diseases in multiple tissues throughout the body, as well as to the annotation of fibroblast subtypes in a single tissue, tendon, where fibrosis is prevalent. We successfully achieved automatic prediction of tendinopathy-specific cell subtypes, key targets, and related drugs. In summary, gPRINT provides an automated and unified approach for identifying DSCSs across datasets, facilitating the elucidation of specific cell subtypes under different disease states and providing a powerful tool for exploring therapeutic targets in diseases.
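
The abstract does not specify how a "gene print" is computed, so the following is only a rough illustration of the general idea of turning position-ordered gene expression into a one-dimensional, signal-like profile for a single cell. The function name `gene_print`, the binning by equal-sized chunks, and the toy numbers are all assumptions made for this sketch; this is not the gPRINT implementation.

```python
import numpy as np

def gene_print(expression, positions, n_bins=4):
    """Hypothetical helper: summarize one cell's expression as a
    position-ordered profile (NOT the published gPRINT method)."""
    expression = np.asarray(expression, dtype=float)
    positions = np.asarray(positions)
    # Order genes by genomic position so neighbouring genes are adjacent
    order = np.argsort(positions, kind="stable")
    ordered = expression[order]
    # Split the ordered signal into n_bins nearly equal chunks and
    # average each chunk to get a coarse, fixed-length profile
    chunks = np.array_split(ordered, n_bins)
    return np.array([chunk.mean() for chunk in chunks])

# Toy example: four genes with made-up positions and expression values
positions = np.array([300, 100, 400, 200])
expression = np.array([3.0, 1.0, 4.0, 2.0])
print(gene_print(expression, positions, n_bins=2))
```

A fixed-length profile of this kind could in principle serve as input to a classifier such as a neural network, which is the role the abstract assigns to the network stage; the actual features and architecture used by gPRINT are described in the full paper, not here.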

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ab9/12342163/fabfcd5eb2d1/pwaf001_fig1.jpg
