School of Public Health, Yale University, New Haven, CT 06520, USA.
BMC Bioinformatics. 2010 May 20;11:271. doi: 10.1186/1471-2105-11-271.
Extensive biomedical studies have shown that clinical and environmental risk factors may not have sufficient predictive power for cancer prognosis. The development of high-throughput profiling technologies makes it possible to survey the whole genome and search for genomic markers with predictive power. Many existing studies assume the interchangeability of gene effects and ignore the coordination among them.
We adopt the weighted co-expression network to describe the interplay among genes. Although there are several ways of defining gene networks, the weighted co-expression network may be preferred because it is computationally simple, performs satisfactorily in practice, and does not demand additional biological experiments. For cancer prognosis studies with gene expression measurements, we propose a new marker selection method that properly incorporates the network connectivity of genes. We analyze six prognosis studies on breast cancer and lymphoma. We find that the genes identified by the proposed approach differ significantly from those identified using alternatives. A search of the published literature suggests that the genes identified using the proposed approach are biologically meaningful. In addition, they have better prediction performance and reproducibility than genes identified using alternatives.
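As a rough illustration of what a weighted co-expression network looks like in practice, the sketch below builds one from an expression matrix. It assumes the common power-adjacency convention, in which the weight between two genes is the absolute Pearson correlation raised to a power `beta`, and it takes a gene's connectivity to be the sum of its weights. The abstract does not state the exact construction used, so the function names, the value `beta=6`, and the simulated data are all assumptions made for illustration; this is not the marker selection method itself.

```python
import numpy as np

def coexpression_adjacency(expr, beta=6):
    """Weighted adjacency from a (samples x genes) expression matrix.

    Assumed convention: a_ij = |cor(x_i, x_j)| ** beta, with a zero diagonal.
    """
    corr = np.corrcoef(expr, rowvar=False)  # genes x genes Pearson correlation
    adj = np.abs(corr) ** beta              # soft thresholding keeps weights in [0, 1]
    np.fill_diagonal(adj, 0.0)              # no self-connections
    return adj

def connectivity(adj):
    """Connectivity of each gene: the sum of its edge weights."""
    return adj.sum(axis=1)

# Simulated data (hypothetical): genes 0-2 share a common signal, genes 3-5 are noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=(50, 1))
expr = np.hstack([
    shared + 0.1 * rng.normal(size=(50, 3)),
    rng.normal(size=(50, 3)),
])

adj = coexpression_adjacency(expr)
k = connectivity(adj)
print(np.round(k, 3))
```

In this toy setting the three co-regulated genes should show much higher connectivity than the three noise genes, which is the kind of structure a network-based selection method can exploit.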
The network contains important information on the functionality of genes. Incorporating the network structure can improve cancer marker identification.