Xin Jingxue, Ren Xianwen, Chen Luonan, Wang Yong
BMC Med Genomics. 2015;8 Suppl 2(Suppl 2):S11. doi: 10.1186/1755-8794-8-S2-S11. Epub 2015 May 29.
Identifying effective biomarkers to battle complex diseases is an important but challenging task in biomedical research today. Molecular data on complex diseases are increasingly abundant owing to the rapid advance of high-throughput technologies. However, a great gap remains in linking these massive molecular data to phenotypic changes, particularly at the network level. A novel method for identifying network biomarkers is therefore urgently needed to accurately classify and diagnose diseases from molecular data and to shed light on the mechanisms of disease pathogenesis. Rather than seeking differential genes at the individual-molecule level, here we propose a novel method for identifying network biomarkers based on protein-protein interaction affinity (PPIA), which identifies differential interactions at the network level. Specifically, we first define PPIAs by estimating the concentrations of protein complexes from gene expression data according to the law of mass action. We then use the PPIAs to select a small, non-redundant group of protein-protein interactions and single proteins that maximizes the ability to discriminate cases from controls. The method is formulated mathematically as a linear programming problem, which can be solved efficiently and guarantees a globally optimal solution. Extensive results on experimental breast cancer data demonstrate the effectiveness and efficiency of the proposed method for identifying network biomarkers, which not only accurately distinguishes the phenotypes but also provides significant biological insights at the network or pathway level. In addition, our method provides a new way to integrate static protein-protein interaction information with dynamic gene expression data.
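The abstract says that PPIAs are obtained by estimating protein-complex concentrations from gene expression data under the law of mass action, but it does not give the formula. The following is a minimal sketch of one way such an estimate could look, not the authors' implementation. It assumes simple 1:1 binding (A + B <-> AB), treats each interacting pair independently (ignoring competition for shared partners), uses expression values as a proxy for total protein concentration, and introduces a hypothetical dissociation constant `k_d`. The function names and the toy data are illustrative only.

```python
import math


def complex_concentration(total_a, total_b, k_d=1.0):
    """Equilibrium concentration [AB] for A + B <-> AB (assumed 1:1 binding).

    Solves (total_a - c) * (total_b - c) = k_d * c for the root that
    satisfies 0 <= c <= min(total_a, total_b). k_d is a hypothetical
    dissociation constant; the abstract does not say how it is chosen.
    """
    if total_a < 0 or total_b < 0 or k_d <= 0:
        raise ValueError("concentrations must be >= 0 and k_d must be > 0")
    s = total_a + total_b + k_d
    # Discriminant s^2 - 4ab, written in a form that cannot go negative.
    disc = (total_a - total_b) ** 2 + k_d * (2.0 * (total_a + total_b) + k_d)
    # Algebraically equal to (s - sqrt(disc)) / 2, but avoids cancellation.
    return 2.0 * total_a * total_b / (s + math.sqrt(disc))


def ppia_profile(expression, edges, k_d=1.0):
    """Map each PPI edge (a, b) to its estimated complex concentration."""
    return {
        (a, b): complex_concentration(expression[a], expression[b], k_d)
        for a, b in edges
    }


# Toy, made-up data: a static PPI network and one sample per phenotype.
edges = [("P1", "P2"), ("P2", "P3")]
case = {"P1": 4.0, "P2": 3.0, "P3": 0.5}
control = {"P1": 1.0, "P2": 3.0, "P3": 0.5}

case_ppia = ppia_profile(case, edges)
control_ppia = ppia_profile(control, edges)

# Per-edge difference: a crude stand-in for a "differential interaction".
diff = {edge: case_ppia[edge] - control_ppia[edge] for edge in edges}
```

In this toy example the P1-P2 edge differs between the two samples even though the expression of P2 is unchanged, which illustrates how an interaction-level feature can combine a static network with sample-specific expression. Selecting a small, non-redundant subset of such features by linear programming is a separate step that this sketch does not cover.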