A deep learning method for classification of HNSCC and HPV patients using single-cell transcriptomics.

Authors

Jarwal Akanksha, Dhall Anjali, Arora Akanksha, Patiyal Sumeet, Srivastava Aman, Raghava Gajendra P S

Affiliation

Department of Computational Biology, Indraprastha Institute of Information Technology, Delhi, India.

Publication details

Front Mol Biosci. 2024 May 30;11:1395721. doi: 10.3389/fmolb.2024.1395721. eCollection 2024.

Abstract

BACKGROUND

Head and Neck Squamous Cell Carcinoma (HNSCC) is the seventh most prevalent cancer type worldwide. Early detection of HNSCC is one of the major challenges in managing the treatment of cancer patients. Existing techniques for detecting HNSCC are costly and invasive.

METHODS

In this study, we aimed to address this issue by developing machine learning and deep learning classification models that use single-cell transcriptomics to distinguish HNSCC from normal samples. We also built models to classify HNSCC samples as HPV-positive (HPV+) or HPV-negative (HPV-). From the GSE181919 dataset, we extracted 20 primary cancer (HNSCC) samples and 9 normal tissue samples; the primary cancer samples comprised 13 HPV- and 7 HPV+ samples. The models were trained on 80% of the dataset and validated on the remaining 20%. To build an efficient model, we performed feature selection with the minimum redundancy maximum relevance (mRMR) method to shortlist a small number of genes from the full gene set. We also performed Gene Ontology (GO) enrichment analysis on the 100 shortlisted genes.
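The abstract does not say how mRMR was implemented, so the following is only a minimal sketch of the general idea: greedily pick the gene with the highest relevance to the class label minus its mean redundancy with the genes already picked. Using mutual information on binarized expression values, the "difference" form of the score, and the toy gene names are all assumptions made for illustration, not details taken from the study.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def mrmr(features, labels, k):
    """Greedy mRMR selection (difference form: relevance - mean redundancy).

    features: dict mapping a gene name to its discretized values per cell.
    labels:   class label per cell (e.g. 1 = HNSCC, 0 = normal).
    """
    relevance = {g: mutual_information(v, labels) for g, v in features.items()}
    selected = []
    while len(selected) < min(k, len(features)):

        def score(gene):
            if not selected:
                return relevance[gene]
            redundancy = sum(
                mutual_information(features[gene], features[s]) for s in selected
            ) / len(selected)
            return relevance[gene] - redundancy

        candidates = [g for g in features if g not in selected]
        selected.append(max(candidates, key=score))
    return selected


# Hypothetical toy data: 8 cells, binarized expression (1 = expressed).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_genes = {
    "gene_a": [0, 0, 0, 1, 1, 1, 1, 1],  # informative
    "gene_b": [0, 0, 0, 1, 1, 1, 1, 1],  # exact copy of gene_a (redundant)
    "gene_c": [0, 0, 0, 0, 0, 0, 1, 1],  # weaker, but adds new information
    "gene_d": [0, 1, 0, 1, 0, 1, 0, 1],  # unrelated to the label
}
selected = mrmr(toy_genes, labels, k=2)
print(selected)
```

In this toy example the duplicate gene is skipped in favor of a less relevant but less redundant one, which is the behavior that distinguishes mRMR from ranking genes by relevance alone.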

RESULTS

An Artificial Neural Network (ANN) based model trained on 100 genes outperformed the other classifiers, achieving an AUROC of 0.91 for HNSCC classification on the validation set. The same algorithm achieved an AUROC of 0.83 for classifying HPV+ and HPV- patients on the validation set. GO enrichment analysis showed that most of the shortlisted genes were involved in binding and catalytic activities.
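For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the labels and scores are made-up values for illustration and have no connection to the reported 0.91 and 0.83.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical validation labels (1 = HNSCC) and classifier scores.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
value = auroc(y_true, y_score)
print(round(value, 3))
```

The pairwise form is quadratic in the number of samples, which is fine for a small validation set; a rank-based implementation would be the usual choice for large ones.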

CONCLUSION

A Python software package has been developed that allows users to identify HNSCC in patients and determine their HPV status. It is available at https://webs.iiitd.edu.in/raghava/hnscpred/.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/39b0/11169846/031fbf4bd4c9/fmolb-11-1395721-g001.jpg