Duong Huong Thu, Huynh Nam Cong-Nhat, Nguyen Chi Thi-Kim, Le Linh Gia-Hoang, Nguyen Khoa Dang, Nguyen Hieu Trong, Tu Lan Ngoc-Ly, Tran Nam Huynh-Bao, Giang Hoa, Nguyen Hoai-Nghia, Ho Chuong Quoc, Hoang Hung Trong, Dang Thinh Huy-Quoc, Thai Tu Anh, Cao Dong Van
Faculty of Odonto-stomatology, University of Medicine and Pharmacy at Ho Chi Minh City, Ho Chi Minh City, Viet Nam.
Center for Molecular Biomedicine, University of Medicine and Pharmacy at Ho Chi Minh City, Ho Chi Minh City, Viet Nam.
J Dent Sci. 2024 Dec;19(Suppl 1):S81-S90. doi: 10.1016/j.jds.2024.08.013. Epub 2024 Aug 28.
BACKGROUND/PURPOSE: Oral squamous cell carcinoma (OSCC) has low survival rates, largely because it is commonly diagnosed at an advanced stage. To improve early detection and prognostic assessment, this study applied machine learning (ML) to interpret complex patterns in mRNA-sequencing (RNA-seq) data and clinical-histopathological features.
MATERIALS AND METHODS: The study used 206 retrospective Vietnamese OSCC formalin-fixed paraffin-embedded (FFPE) tumor samples, of which 101 were subjected to RNA-seq for classification based on gene expression. ML models were then built on clinical-histopathological data to predict OSCC subtypes and propose potential biomarkers for the remaining 105 samples.
RESULTS: OSCC was divided into two distinct groups with different clinical-histopathological characteristics and gene expression. Subgroup 1 was characterized by severe histopathologic features with immune response and apoptosis signatures, while subgroup 2 was characterized by more clinical/pathological features and by cell division and malignant signatures. XGBoost and Support Vector Machine (SVM) models showed the best performance in predicting OSCC subtype. The study also proposed 12 candidate genes as potential biomarkers for OSCC subtypes (six per group).
CONCLUSION: The study identified characteristics of Vietnamese OSCC patients by combining mRNA sequencing with clinical-histopathological analysis. It provides insight into the tumor microenvironment of OSCC and offers accurate ML models for biomarker prediction from clinical-histopathological features.