PreCanCell: An ensemble learning algorithm for predicting cancer and non-cancer cells from single-cell transcriptomes.

Authors

Yang Tao, Yan Qiyu, Long Rongzhuo, Liu Zhixian, Wang Xiaosheng

Affiliations

Biomedical Informatics Research Lab, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 211198, China.

Cancer Genomics Research Center, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 211198, China.

Publication information

Comput Struct Biotechnol J. 2023 Jul 11;21:3604-3614. doi: 10.1016/j.csbj.2023.07.009. eCollection 2023.

DOI:10.1016/j.csbj.2023.07.009
PMID:37501705
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10371765/
Abstract

We propose PreCanCell, a novel algorithm for predicting malignant and non-malignant cells from single-cell transcriptomes. PreCanCell first identifies the differentially expressed genes (DEGs) between malignant and non-malignant cells that are shared across five single-cell transcriptome datasets, each associated with a common cancer type. The five cancer types are renal cell carcinoma (RCC), head and neck squamous cell carcinoma (HNSCC), melanoma, lung adenocarcinoma (LUAD), and breast cancer (BC). With each of the five datasets as the training set and the DEGs as the features, a single cell is classified as malignant or non-malignant by k-NN (k = 5). Finally, the single cell is determined to be malignant or non-malignant by the majority vote of the five k-NN classification results. We tested the predictive performance of PreCanCell in 19 single-cell datasets and reported classification accuracy, sensitivity, specificity, balanced accuracy (the average of sensitivity and specificity), and the area under the receiver operating characteristic curve (AUROC). In all these datasets, PreCanCell achieved values above 0.8 for accuracy, sensitivity, specificity, balanced accuracy, and AUROC. Finally, we compared the predictive performance of PreCanCell with that of seven other algorithms: CHETAH, SciBet, SCINA, scmap-cell, scmap-cluster, SingleR, and ikarus. Compared to these algorithms, PreCanCell offers higher accuracy and simpler implementation. We have developed an R package for the PreCanCell algorithm, which is available at https://github.com/WangX-Lab/PreCanCell.
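
The voting scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R package or its API: the function names, the toy two-feature data, and the use of Euclidean distance are all assumptions made for illustration, since the abstract does not specify the distance metric, the preprocessing, or how the DEGs are selected. The sketch shows only the two ideas that the abstract states: one k-NN classifier (k = 5) per training dataset combined by majority vote, and balanced accuracy as the mean of sensitivity and specificity.

```python
import math
from collections import Counter

MALIGNANT, NON_MALIGNANT = "malignant", "non-malignant"


def knn_predict(train_X, train_y, query, k=5):
    """Label `query` by majority vote among its k nearest training cells.

    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    nearest = sorted(
        range(len(train_X)), key=lambda i: math.dist(train_X[i], query)
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def ensemble_predict(training_sets, query, k=5):
    """Run one k-NN classifier per training set, then take the majority vote.

    With two classes and an odd number of training sets, ties cannot occur.
    """
    calls = [knn_predict(X, y, query, k) for X, y in training_sets]
    return Counter(calls).most_common(1)[0][0]


def balanced_accuracy(y_true, y_pred):
    """Average of sensitivity and specificity, with malignant as positive.

    Assumes both classes are present in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == MALIGNANT and p == MALIGNANT for t, p in pairs)
    fn = sum(t == MALIGNANT and p == NON_MALIGNANT for t, p in pairs)
    tn = sum(t == NON_MALIGNANT and p == NON_MALIGNANT for t, p in pairs)
    fp = sum(t == NON_MALIGNANT and p == MALIGNANT for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def toy_training_set(shift):
    """Hypothetical dataset: 5 malignant and 5 non-malignant cells, 2 features.

    Each feature stands in for the expression value of one DEG.
    """
    malignant = [(5.0 + shift + 0.1 * i, 5.0 - 0.1 * i) for i in range(5)]
    normal = [(1.0 + shift + 0.1 * i, 1.0 + 0.1 * i) for i in range(5)]
    return malignant + normal, [MALIGNANT] * 5 + [NON_MALIGNANT] * 5


# Five toy training sets stand in for the RCC, HNSCC, melanoma, LUAD and BC data.
training_sets = [toy_training_set(0.2 * s) for s in range(5)]

queries = [(4.8, 5.1), (1.2, 0.9)]
predictions = [ensemble_predict(training_sets, q) for q in queries]
```

In this toy setup the first query cell lies near the malignant cluster of every training set and the second near the non-malignant cluster, so the five classifiers agree on both. On real data the classifiers can disagree, which is where the majority vote matters.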


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1c1b/10371765/7382e6812c3e/ga1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1c1b/10371765/01ce9334e2e4/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1c1b/10371765/ffe0b75fc250/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1c1b/10371765/7cfdc5e1725a/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1c1b/10371765/8175b7c4fb8b/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1c1b/10371765/a9474f4c67bd/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1c1b/10371765/2850570dbb20/gr6.jpg
