Abruzzo Lynne V, Barron Lynn L, Anderson Keith, Newman Rachel J, Wierda William G, O'Brien Susan, Ferrajoli Alessandra, Luthra Madan, Talwalkar Sameer, Luthra Rajyalakshmi, Jones Dan, Keating Michael J, Coombes Kevin R
University of Texas M.D. Anderson Cancer Center, Department of Hematopathology, Box 72, 1515 Holcombe Blvd., Houston, TX 77030, USA.
J Mol Diagn. 2007 Sep;9(4):546-55. doi: 10.2353/jmoldx.2007.070001. Epub 2007 Aug 9.
To develop a model incorporating relevant prognostic biomarkers for untreated chronic lymphocytic leukemia patients, we re-analyzed the raw data from four published gene expression profiling studies. We selected 88 candidate biomarkers linked to immunoglobulin heavy-chain variable region gene (IgV(H)) mutation status and produced a reliable and reproducible microfluidics quantitative real-time polymerase chain reaction array. We applied this array to a training set of 29 purified samples from previously untreated patients. In an unsupervised analysis, the samples clustered into two groups. Using a cutoff point of 2% divergence from the germline IgV(H) sequence (that is, 98% homology), one group contained all 14 IgV(H)-unmutated samples; the other contained all 15 mutated samples. We confirmed the differential expression of 37 of the candidate biomarkers using two-sample t-tests. Next, we constructed 16 different models to predict IgV(H) mutation status and evaluated their performance on an independent test set of 20 new samples. Nine models correctly classified 11 of 11 IgV(H)-mutated cases and eight of nine IgV(H)-unmutated cases, with some models using three to seven genes. Thus, we can classify cases with 95% accuracy based on the expression of as few as three genes.
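The 95% figure follows directly from the test-set counts reported above (11 of 11 mutated and 8 of 9 unmutated cases, 20 samples in total). A minimal sketch of that arithmetic is given below; the function name and the per-class rate labels are illustrative assumptions, not part of the authors' analysis.

```python
from fractions import Fraction


def classification_summary(mutated_correct, mutated_total,
                           unmutated_correct, unmutated_total):
    """Summarize per-class and overall accuracy from raw counts.

    Hypothetical helper for illustration only; the counts passed in
    below are the test-set results stated in the abstract.
    """
    total = mutated_total + unmutated_total
    correct = mutated_correct + unmutated_correct
    return {
        # Fraction of IgV(H)-mutated cases classified correctly
        "mutated_rate": Fraction(mutated_correct, mutated_total),
        # Fraction of IgV(H)-unmutated cases classified correctly
        "unmutated_rate": Fraction(unmutated_correct, unmutated_total),
        # Overall fraction of test samples classified correctly
        "accuracy": Fraction(correct, total),
    }


# Test-set counts from the abstract: 11/11 mutated, 8/9 unmutated.
summary = classification_summary(11, 11, 8, 9)
for name, value in summary.items():
    print(f"{name}: {value} ({float(value):.1%})")
```

Exact fractions are used so that the overall accuracy comes out as 19/20 with no floating-point rounding; the single misclassified sample is an IgV(H)-unmutated case.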