IEEE/ACM Trans Comput Biol Bioinform. 2021 Nov-Dec;18(6):2483-2491. doi: 10.1109/TCBB.2020.2973148. Epub 2021 Dec 8.
Breast cancer is a heterogeneous disease with many clinically distinguishable molecular subtypes, each corresponding to a cluster of patients. Identifying prognostic and heterogeneous biomarkers for breast cancer means detecting cluster-specific gene biomarkers that can be used to predict patient survival accurately. In this study, we proposed a FUsion Network-based method (FUNMarker) to identify prognostic and heterogeneous breast cancer biomarkers by considering the heterogeneity of patient samples and biological information from multiple sources. To reduce the effect of patient heterogeneity, samples were first clustered using the K-means algorithm on the principal components of gene expression. For each cluster, to comprehensively evaluate the influence of genes on breast cancer, genes were weighted from three aspects: biological function, prognostic ability, and correlation with known disease genes. The genes were then ranked via a label propagation model on a fusion network that combined physical protein interactions from seven types of networks and could thus reduce the impact of the incompleteness of the interactome. We compared FUNMarker with three state-of-the-art methods, and the results showed that the biomarkers identified by FUNMarker were biologically interpretable and had stronger discriminative power than those of the existing methods in differentiating patients with different prognostic outcomes.
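The two computational steps named in the abstract, K-means clustering on principal components and label propagation over a fused network, can be illustrated with a minimal NumPy sketch. This is not the FUNMarker implementation. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the number of clusters, the averaging rule in the hypothetical `fuse_networks` helper, the damping factor `alpha`, the symmetric normalization, and the toy initial gene weights `y0` (which stand in for the three-aspect weighting the abstract describes).

```python
import numpy as np

def pca_scores(X, n_components):
    """Project samples (rows of X) onto the top principal components."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def kmeans(Z, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of Z."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def fuse_networks(adjacencies):
    """Assumed fusion rule: element-wise mean of symmetric adjacency matrices."""
    return np.mean(adjacencies, axis=0)

def label_propagation(W, y0, alpha=0.8, tol=1e-9, max_iter=1000):
    """Iterate f <- alpha * S f + (1 - alpha) * y0, with S = D^-1/2 W D^-1/2."""
    deg = W.sum(axis=1)
    inv_sqrt = np.zeros_like(deg, dtype=float)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]
    f = y0.astype(float).copy()
    for _ in range(max_iter):
        f_next = alpha * (S @ f) + (1 - alpha) * y0
        if np.abs(f_next - f).max() < tol:
            return f_next
        f = f_next
    return f

# Toy data (synthetic, for illustration only): 20 samples x 5 genes,
# drawn from two well-separated expression profiles.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.1, (10, 5)), rng.normal(100.0, 0.1, (10, 5))])
labels = kmeans(pca_scores(X, n_components=2), k=2)

# Two toy 4-gene networks that each miss an edge the other one has.
net_a = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]], float)
net_b = np.array([[0, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
W = fuse_networks([net_a, net_b])

# Placeholder initial weights: only gene 0 starts with evidence.
y0 = np.array([1.0, 0.0, 0.0, 0.0])
scores = label_propagation(W, y0)
ranking = np.argsort(-scores)  # gene indices, highest score first
```

In the toy networks, gene 3 is isolated in `net_a` and gene 0 is isolated in `net_b`, yet both receive a score after fusion. This mirrors the abstract's point that combining networks can offset the incompleteness of any single interactome.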