Department of Biomedical Informatics, The Ohio State University, OH, USA.
BMC Bioinformatics. 2010 Oct 28;11 Suppl 9(Suppl 9):S5. doi: 10.1186/1471-2105-11-S9-S5.
Chronic lymphocytic leukemia (CLL) is the most common adult leukemia. It is a highly heterogeneous disease and can be divided roughly into indolent and progressive stages based on classic clinical markers. Immunoglobulin heavy chain variable region (IgVH) mutational status has been found to be associated with patient survival outcome, and biomarkers linked to IgVH status have been a focus of CLL prognosis research. However, biomarkers that are highly correlated with IgVH mutational status and can accurately predict survival outcome are yet to be discovered.
In this paper, we investigate the use of gene co-expression network analysis to identify potential biomarkers for CLL. Specifically, we focused on the co-expression network involving ZAP70, a well-characterized biomarker for CLL. We selected 23 microarray datasets corresponding to multiple types of cancer from the Gene Expression Omnibus (GEO) and used the frequent network mining algorithm CODENSE to identify highly connected gene co-expression networks spanning the entire genome. We then evaluated the genes in the co-expression network in which ZAP70 is involved, and applied a set of feature selection methods to select, from that network, the genes that can predict IgVH mutation status.
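The idea of keeping only co-expression links that recur across datasets can be illustrated with a minimal sketch. This is not CODENSE and not the authors' pipeline: it assumes Pearson correlation as the co-expression measure and simply counts, for a seed gene, how many datasets support each link. The function names, the threshold values, the placeholder gene `GENE_X`, and all expression values are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sx * sy)


def coexpression_edges(dataset, threshold):
    """Gene pairs whose |correlation| meets the threshold in one dataset."""
    edges = set()
    for g1, g2 in combinations(sorted(dataset), 2):
        if abs(pearson(dataset[g1], dataset[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


def frequent_neighbors(datasets, seed, threshold=0.8, min_support=2):
    """Genes linked to `seed` in at least `min_support` datasets."""
    support = {}
    for dataset in datasets:
        for g1, g2 in coexpression_edges(dataset, threshold):
            if seed in (g1, g2):
                other = g2 if g1 == seed else g1
                support[other] = support.get(other, 0) + 1
    return sorted(g for g, s in support.items() if s >= min_support)


# Made-up toy data: two tiny "datasets", each mapping gene -> expression values.
toy_datasets = [
    {"ZAP70": [1, 2, 3, 4], "CD247": [2, 4, 6, 8], "GENE_X": [4, 1, 3, 2]},
    {"ZAP70": [5, 3, 4, 1], "CD247": [10, 6, 8, 2.5], "GENE_X": [1, 2, 2, 1]},
]

neighbors = frequent_neighbors(toy_datasets, "ZAP70")
print(neighbors)
```

In this toy input only CD247 is correlated with ZAP70 in both datasets, so it is the only gene retained. A real analysis would use the full expression matrices and a dense-subgraph mining step rather than a per-seed count.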
We have identified a set of genes that are potential CLL prognostic biomarkers: IL2RB, CD8A, CD247, LAG3 and KLRK1. These genes can predict CLL patient IgVH mutational status with high accuracy. Their prognostic capabilities were cross-validated by using these biomarker candidates to classify patients into different outcome groups in a CLL microarray dataset with clinical information.