Breast cancer prognosis by combinatorial analysis of gene expression data.

Author information

Alexe Gabriela, Alexe Sorin, Axelrod David E, Bonates Tibérius O, Lozina Irina I, Reiss Michael, Hammer Peter L

Affiliation

RUTCOR (Rutgers University Center for Operations Research), Piscataway, New Jersey, USA.

Publication information

Breast Cancer Res. 2006;8(4):R41. doi: 10.1186/bcr1512.

DOI: 10.1186/bcr1512

PMID: 16859500

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1779471/

Abstract

INTRODUCTION

The potential of applying data analysis tools to microarray data for diagnosis and prognosis is illustrated on the recent breast cancer dataset of van 't Veer and coworkers. We re-examine that dataset using the novel technique of logical analysis of data (LAD), with the double objective of discovering patterns characteristic for cases with good or poor outcome, using them for accurate and justifiable predictions; and deriving novel information about the role of genes, the existence of special classes of cases, and other factors.

METHOD

Data were analyzed using the combinatorics and optimization-based method of LAD, recently shown to provide highly accurate diagnostic and prognostic systems in cardiology, cancer proteomics, hematology, pulmonology, and other disciplines.

RESULTS

LAD identified a subset of 17 of the 25,000 genes, capable of fully distinguishing between patients with poor, respectively good prognoses. An extensive list of 'patterns' or 'combinatorial biomarkers' (that is, combinations of genes and limitations on their expression levels) was generated, and 40 patterns were used to create a prognostic system, shown to have 100% and 92.9% weighted accuracy on the training and test sets, respectively. The prognostic system uses fewer genes than other methods, and has similar or better accuracy than those reported in other studies. Out of the 17 genes identified by LAD, three (respectively, five) were shown to play a significant role in determining poor (respectively, good) prognosis. Two new classes of patients (described by similar sets of covering patterns, gene expression ranges, and clinical features) were discovered. As a by-product of the study, it is shown that the training and the test sets of van 't Veer have differing characteristics.
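
The idea of a pattern as a combination of genes with limits on their expression levels can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the gene names, cut-points, and sample values are hypothetical, and weighted accuracy is assumed here to mean the average of sensitivity and specificity, which the abstract does not define.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Condition:
    """A bound on one gene's expression level (hypothetical gene name)."""
    gene: str
    low: float = -inf   # inclusive lower limit
    high: float = inf   # inclusive upper limit

    def holds(self, sample):
        return self.low <= sample[self.gene] <= self.high


@dataclass(frozen=True)
class Pattern:
    """A conjunction of conditions associated with one prognosis class."""
    label: str          # "poor" or "good"
    conditions: tuple

    def covers(self, sample):
        # A pattern covers a patient only if every condition holds.
        return all(c.holds(sample) for c in self.conditions)


def weighted_accuracy(y_true, y_pred):
    """Assumed definition: mean of sensitivity and specificity.

    Requires at least one "poor" and one "good" case in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    poor = [p for t, p in pairs if t == "poor"]
    good = [p for t, p in pairs if t == "good"]
    sensitivity = poor.count("poor") / len(poor)
    specificity = good.count("good") / len(good)
    return (sensitivity + specificity) / 2


# Hypothetical pattern: GENE_A at least 0.5 AND GENE_B at most -0.2.
poor_pattern = Pattern(
    "poor",
    (Condition("GENE_A", low=0.5), Condition("GENE_B", high=-0.2)),
)
patient = {"GENE_A": 0.9, "GENE_B": -0.4}
is_covered = poor_pattern.covers(patient)

score = weighted_accuracy(
    ["poor", "poor", "good", "good"],
    ["poor", "good", "good", "good"],
)
```

Averaging the two per-class rates keeps the score from being dominated by whichever prognosis class has more patients, which is one plausible reason to report a weighted figure on an unbalanced test set.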

CONCLUSION

The study shows that LAD provides an accurate and fully explanatory prognostic system for breast cancer using genomic data (that is, a system that, in addition to predicting good or poor prognosis, provides an individualized explanation of the reasons for that prognosis for each patient). Moreover, the LAD model provides valuable insights into the roles of individual and combinatorial biomarkers, allows the discovery of new classes of patients, and generates a vast library of biomedical research hypotheses.
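
What an individualized explanation might look like can be sketched as follows. This is an illustrative sketch only: the pattern names, genes, and bounds are hypothetical, and the scoring rule (fraction of poor-prognosis patterns triggered minus fraction of good-prognosis patterns triggered) is an assumption, since the abstract does not state how the 40 patterns are combined.

```python
from math import inf

# Hypothetical patterns: each maps genes to (low, high) expression bounds.
PATTERNS = [
    {"name": "P1", "label": "poor", "bounds": {"GENE_A": (0.5, inf)}},
    {"name": "P2", "label": "poor", "bounds": {"GENE_C": (-inf, -1.0)}},
    {"name": "G1", "label": "good",
     "bounds": {"GENE_A": (-inf, 0.0), "GENE_B": (-inf, 0.2)}},
]


def covers(pattern, sample):
    """True if every bound of the pattern holds for this patient."""
    return all(lo <= sample[gene] <= hi
               for gene, (lo, hi) in pattern["bounds"].items())


def explain(sample, patterns):
    """Return (prognosis, names of the patterns that cover the patient)."""
    covering = [p for p in patterns if covers(p, sample)]

    def fraction(label):
        # Share of this class's patterns that the patient triggers.
        total = sum(p["label"] == label for p in patterns)
        hit = sum(p["label"] == label for p in covering)
        return hit / total if total else 0.0

    score = fraction("poor") - fraction("good")  # assumed scoring rule
    if score > 0:
        prognosis = "poor"
    elif score < 0:
        prognosis = "good"
    else:
        prognosis = "unclassified"
    return prognosis, [p["name"] for p in covering]


patient = {"GENE_A": 0.8, "GENE_B": 0.1, "GENE_C": 0.3}
prognosis, reasons = explain(patient, PATTERNS)
```

The list of covering pattern names is the explanation: each name points to the specific gene-expression conditions that the patient satisfies.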

Similar articles

Logical analysis of diffuse large B-cell lymphomas.

Artif Intell Med. 2005 Jul;34(3):235-67. doi: 10.1016/j.artmed.2004.11.004.

Mixture classification model based on clinical markers for breast cancer prognosis.

Artif Intell Med. 2010 Feb-Mar;48(2-3):129-37. doi: 10.1016/j.artmed.2009.07.008. Epub 2009 Dec 14.

A clinical prognostic prediction of lymph node-negative breast cancer by gene expression profiles.

J Cancer Res Clin Oncol. 2006 Sep;132(9):579-87. doi: 10.1007/s00432-006-0108-6. Epub 2006 Jun 8.

Pseudogene-gene functional networks are prognostic of patient survival in breast cancer.

BMC Med Genomics. 2020 Apr 3;13(Suppl 5):51. doi: 10.1186/s12920-020-0687-0.

Gene expression profiles of breast cancer obtained from core cut biopsies before neoadjuvant docetaxel, adriamycin, and cyclophoshamide chemotherapy correlate with routine prognostic markers and could be used to identify predictive signatures.

Zentralbl Gynakol. 2006 Apr;128(2):76-81. doi: 10.1055/s-2006-921508.

Gene expression variation to predict 10-year survival in lymph-node-negative breast cancer.

BMC Cancer. 2008 Sep 8;8:254. doi: 10.1186/1471-2407-8-254.

Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation.

BMC Bioinformatics. 2008 Jan 28;9:63. doi: 10.1186/1471-2105-9-63.

Novel secretome-to-transcriptome integrated or secreto-transcriptomic approach to reveal liquid biopsy biomarkers for predicting individualized prognosis of breast cancer patients.

BMC Med Genomics. 2019 May 30;12(1):78. doi: 10.1186/s12920-019-0530-7.

The Discovery of Novel Biomarkers Improves Breast Cancer Intrinsic Subtype Prediction and Reconciles the Labels in the METABRIC Data Set.

PLoS One. 2015 Jul 1;10(7):e0129711. doi: 10.1371/journal.pone.0129711. eCollection 2015.

Cited by

A Hybrid Machine Learning Approach to Screen Optimal Predictors for the Classification of Primary Breast Tumors from Gene Expression Microarray Data.

Diagnostics (Basel). 2023 Feb 13;13(4):708. doi: 10.3390/diagnostics13040708.

The Breast Cancer Stem Cells Traits and Drug Resistance.

Front Pharmacol. 2021 Jan 28;11:599965. doi: 10.3389/fphar.2020.599965. eCollection 2020.

A Classification Model to Predict the Rate of Decline of Kidney Function.

Front Med (Lausanne). 2017 Jul 19;4:97. doi: 10.3389/fmed.2017.00097. eCollection 2017.

Using complex networks for refining survival prognosis in prostate cancer patient.

F1000Res. 2016 Nov 16;5:2675. doi: 10.12688/f1000research.8282.1. eCollection 2016.

Logical Analysis of Data in Structure-Activity Investigation of Polymeric Gene Delivery.

Macromol Theory Simul. 2011 May 23;20(4):275-285. doi: 10.1002/mats.201000087.

Predicting 5-Year Survival Status of Patients with Breast Cancer based on Supervised Wavelet Method.

Osong Public Health Res Perspect. 2014 Dec;5(6):324-32. doi: 10.1016/j.phrp.2014.09.002. Epub 2014 Nov 1.

Network-based inference framework for identifying cancer genes from gene expression data.

Biomed Res Int. 2013;2013:401649. doi: 10.1155/2013/401649. Epub 2013 Sep 1.

Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

PLoS One. 2012;7(7):e39932. doi: 10.1371/journal.pone.0039932. Epub 2012 Jul 11.

Comparative survival analysis of breast cancer microarray studies identifies important prognostic genetic pathways.

BMC Cancer. 2010 Oct 21;10:573. doi: 10.1186/1471-2407-10-573.

Avoiding Pitfalls in the Statistical Analysis of Heterogeneous Tumors.

Biomed Inform Insights. 2009 Jan 1;2:11-18. doi: 10.4137/bii.s2222.

References

Logical analysis of diffuse large B-cell lymphomas.

Artif Intell Med. 2005 Jul;34(3):235-67. doi: 10.1016/j.artmed.2004.11.004.

A cell proliferation signature is a marker of extremely poor outcome in a subpopulation of breast cancer patients.

Cancer Res. 2005 May 15;65(10):4059-66. doi: 10.1158/0008-5472.CAN-04-3953.

Multi-class cancer classification by total principal component regression (TPCR) using microarray gene expression data.

Nucleic Acids Res. 2005 Jan 7;33(1):56-65. doi: 10.1093/nar/gki144. Print 2005.

A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer.

N Engl J Med. 2004 Dec 30;351(27):2817-26. doi: 10.1056/NEJMoa041588. Epub 2004 Dec 10.

Ovarian cancer detection by logical analysis of proteomic data.

Proteomics. 2004 Mar;4(3):766-83. doi: 10.1002/pmic.200300574.

A CART-based approach to discover emerging patterns in microarray data.

Bioinformatics. 2003 Dec 12;19(18):2465-72. doi: 10.1093/bioinformatics/btg361.

Gene expression profiles of primary breast tumors maintained in distant metastases.

Proc Natl Acad Sci U S A. 2003 Dec 23;100(26):15901-5. doi: 10.1073/pnas.2634067100. Epub 2003 Dec 9.

Linear regression and two-class classification with gene expression data.

Bioinformatics. 2003 Nov 1;19(16):2072-8. doi: 10.1093/bioinformatics/btg283.

Robust singular value decomposition analysis of microarray data.

Proc Natl Acad Sci U S A. 2003 Nov 11;100(23):13167-72. doi: 10.1073/pnas.1733249100. Epub 2003 Oct 27.

Breast cancer classification and prognosis based on gene expression profiles from a population-based study.

Proc Natl Acad Sci U S A. 2003 Sep 2;100(18):10393-8. doi: 10.1073/pnas.1732912100. Epub 2003 Aug 13.
