Triple and quadruple optimization for feature selection in cancer biomarker discovery.

Affiliations

Institute of Biomedicine, School of Medicine, University of Eastern Finland, 70210 Kuopio, Finland.

Publication information

J Biomed Inform. 2024 Nov;159:104736. doi: 10.1016/j.jbi.2024.104736. Epub 2024 Oct 11.

Abstract

The proliferation of omics data has advanced cancer biomarker discovery, but candidate biomarkers often fail external validation, mainly because of a narrow focus on prediction accuracy that neglects clinical utility and validation feasibility. We introduce three- and four-objective optimization strategies based on genetic algorithms to identify clinically actionable biomarkers in omics studies, addressing classification tasks that aim to distinguish cancer subtypes that are hard to differentiate by histological analysis alone. Our hypothesis is that optimizing more than one characteristic of cancer biomarkers yields biomarkers that are more likely to succeed in external validation. Our objectives are to: (i) assess the biomarker panel's accuracy using a machine learning (ML) framework; (ii) ensure the biomarkers exhibit significant fold-changes across subtypes, thereby boosting the success rate of PCR or immunohistochemistry validations; (iii) select a concise set of biomarkers to simplify the validation process and reduce clinical costs; and (iv) identify biomarkers crucial for predicting overall survival, which plays a significant role in determining the prognostic value of cancer subtypes. We implemented triple and quadruple optimization algorithms and applied them to renal carcinoma gene expression data from TCGA. The study targets kidney cancer subtypes that are difficult to distinguish by histopathological methods. The selected RNA-seq biomarkers were assessed against the gold standard method, which relies solely on clinical information, and in external microarray-based validation datasets. Notably, these biomarkers achieved an accuracy above 0.8 in external validations and added significant value to survival predictions, outperforming the use of clinical data alone with a higher c-index. The provided tool also helps explore the trade-off between objectives, offering multiple solutions for clinical evaluation before proceeding to costly validation or clinical trials.
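The trade-off between objectives mentioned in the abstract can be illustrated with a Pareto filter over candidate biomarker panels. The following is a minimal sketch only, not the authors' tool or their genetic algorithm: the `Panel` type, the gene placeholders (`g1`, `g2`, ...), and all scores are hypothetical values invented for illustration. It assumes accuracy, fold-change, and c-index are to be maximized and panel size is to be minimized, and it keeps every panel that no other panel beats on all four objectives at once.

```python
from typing import List, NamedTuple, Tuple


class Panel(NamedTuple):
    """A candidate biomarker panel with hypothetical, made-up scores."""
    genes: Tuple[str, ...]   # placeholder gene identifiers
    accuracy: float          # objective (i): maximize
    fold_change: float       # objective (ii): maximize
    c_index: float           # objective (iv): maximize

    @property
    def size(self) -> int:   # objective (iii): minimize
        return len(self.genes)


def objectives(p: Panel) -> Tuple[float, float, float, float]:
    # Negate size so that every component is "larger is better".
    return (p.accuracy, p.fold_change, -p.size, p.c_index)


def dominates(a: Panel, b: Panel) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    oa, ob = objectives(a), objectives(b)
    return all(x >= y for x, y in zip(oa, ob)) and any(x > y for x, y in zip(oa, ob))


def pareto_front(panels: List[Panel]) -> List[Panel]:
    """Keep the panels that no other panel dominates."""
    return [p for p in panels if not any(dominates(q, p) for q in panels)]


# Hypothetical candidates; none of these numbers come from the study.
panels = [
    Panel(("g1", "g2"), accuracy=0.85, fold_change=2.0, c_index=0.70),
    Panel(("g1", "g2", "g3"), accuracy=0.90, fold_change=1.5, c_index=0.72),
    Panel(("g4",), accuracy=0.80, fold_change=2.5, c_index=0.65),
    Panel(("g1", "g5", "g6"), accuracy=0.84, fold_change=1.4, c_index=0.69),
]

front = pareto_front(panels)
for p in front:
    print(p.genes, objectives(p))
```

In this toy data the last panel is dominated by the first (it is worse or equal on every objective), so it is dropped, while the remaining three each win on at least one objective and would all be offered for clinical evaluation. A genetic algorithm such as the one the abstract describes would generate and evolve the candidates; this sketch covers only the final non-dominated selection step.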
