Evaluating Ovarian Cancer Chemotherapy Response Using Gene Expression Data and Machine Learning.

Author information

Amniouel Soukaina, Yalamanchili Keertana, Sankararaman Sreenidhi, Jafri Mohsin Saleet

Affiliations

School of System Biology, George Mason University, Fairfax, VA 22030, USA.

School of Engineering, Brown University, Providence, RI 02912, USA.

Publication information

BioMedInformatics. 2024 Jun;4(2):1396-1424. doi: 10.3390/biomedinformatics4020077. Epub 2024 May 22.

DOI:10.3390/biomedinformatics4020077

PMID:39149564

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11326537/

Abstract

BACKGROUND

Ovarian cancer (OC) is the most lethal gynecological cancer in the United States. Among the different types of OC, serous ovarian cancer (SOC) stands out as the most prevalent. Transcriptomics techniques generate extensive gene expression data, yet only a few of these genes are relevant to clinical diagnosis.

METHODS

Feature selection (FS) methods address the high dimensionality of large datasets. This study proposes a computational framework that applies FS techniques to identify genes highly associated with platinum-based chemotherapy response in SOC patients. The LASSO and varSelRF FS methods were applied to SOC datasets from the Gene Expression Omnibus (GEO) database. Machine learning classification algorithms such as random forest (RF) and support vector machine (SVM) were then used to evaluate the performance of the models.
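The abstract does not give implementation details, so the following is only a minimal sketch of the idea behind LASSO-based gene selection, not the authors' pipeline. It assumes a least-squares LASSO fitted by coordinate descent on a small synthetic expression matrix with the response coded as +1/-1; the data, the penalty value, and all function and variable names are hypothetical. The varSelRF step (an R package) and the RF/SVM evaluation are not reproduced here.

```python
import random


def soft_threshold(value, penalty):
    """Shrink value toward zero; anything within +/- penalty becomes exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def center_columns(rows):
    """Subtract each column's mean so that no intercept term is needed."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def lasso_coordinate_descent(X, y, penalty, n_sweeps=200):
    """Minimise (1/2n)*||y - Xw||^2 + penalty*||w||_1 by cyclic coordinate descent.

    Assumes the columns of X and the vector y are already centred.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, and w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant gene carries no information
            # Covariance of gene j with the residual that excludes gene j itself
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Synthetic stand-in for an expression matrix (hypothetical, not GEO data).
rng = random.Random(0)
n_patients, n_genes = 100, 30
expression = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_patients)]
# Only genes 0 and 1 drive the simulated response; the rest are noise.
signal = [3.0 * row[0] - 3.0 * row[1] + rng.gauss(0, 0.5) for row in expression]
response = [1.0 if s > 0 else -1.0 for s in signal]  # +1 responder, -1 non-responder

X = center_columns(expression)
mean_response = sum(response) / n_patients
y = [r - mean_response for r in response]

weights = lasso_coordinate_descent(X, y, penalty=0.15)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("selected gene indices:", selected)
```

Genes whose coefficient is shrunk to exactly zero drop out of the panel, so a larger penalty yields a smaller signature. In practice the penalty is usually chosen by cross-validation rather than fixed by hand as it is here.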

RESULTS

The proposed framework identified biomarker panels of 9 and 10 genes that are highly correlated with platinum-paclitaxel and platinum-only response in SOC patients, respectively. Predictive models trained on the identified gene signatures achieved an accuracy above 90%.

CONCLUSIONS

In this study, we propose that applying multiple feature selection methods not only effectively reduces the number of identified biomarkers, enhancing their biological relevance, but also corroborates the efficacy of drug response prediction models in cancer treatment.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0ee/11326537/968e780fa6ea/nihms-2013972-f0001.jpg
