Cancer subtype discovery and biomarker identification via a new robust network clustering algorithm.

Affiliation

Center for Computer Vision and Department of Mathematics, Sun Yat-Sen University, Guangzhou, China.

Publication information

PLoS One. 2013 Jun 17;8(6):e66256. doi: 10.1371/journal.pone.0066256. Print 2013.

DOI: 10.1371/journal.pone.0066256

PMID: 23799085

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3684607/

Abstract

In cancer biology, it is important to understand phenotypic changes in patients and to discover new cancer subtypes. Microarray-based technologies have recently shed light on this problem through gene expression profiles, which may contain outliers arising from chemical or electrical causes. Undiscovered subtypes may be heterogeneous with respect to their underlying networks or pathways, and may be related to only a few interdependent biomarkers. This motivates the need for robust gene expression-based methods capable of discovering such subtypes, elucidating the corresponding network structures, and identifying cancer-related biomarkers. This study proposes penalized model-based Student's t clustering with unconstrained covariance (PMT-UC) to discover cancer subtypes with cluster-specific networks, taking gene dependencies into account while remaining robust against outliers. Biomarker identification and network reconstruction are achieved by imposing an adaptive L1 penalty on the means and the inverse scale matrices. The model is fitted via the expectation-maximization (EM) algorithm using the graphical lasso. A network-based gene selection criterion is applied that identifies biomarkers not as individual genes but as subnetworks. This makes it possible to implicate weakly discriminative biomarkers that play a central role in a subnetwork by interconnecting many differentially expressed genes, or that have cluster-specific underlying network structures. Experimental results on simulated datasets and one available cancer dataset attest to the effectiveness and robustness of PMT-UC in cancer subtype discovery. Moreover, PMT-UC can select cancer-related biomarkers that have been verified in biochemical or biomedical research, and can learn biologically significant correlations among genes.
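
The two ingredients named in the abstract, Student's t robustness and a lasso-type (L1) penalty on the means, can be illustrated with a minimal one-dimensional sketch. This is not the authors' PMT-UC implementation: PMT-UC is multivariate, fits a mixture of clusters, uses adaptive penalty weights, and estimates sparse inverse scale matrices with the graphical lasso. The function names (`t_weights`, `soft_threshold`, `robust_penalized_mean`), the fixed scale `sigma2`, the fixed degrees of freedom `nu`, and the toy data are all assumptions made for illustration.

```python
import math


def t_weights(xs, mu, sigma2, nu):
    """E-step weights u_i = (nu + 1) / (nu + d_i) for a univariate t component,
    where d_i = (x_i - mu)^2 / sigma2 is the squared scaled distance.
    Points far from mu receive small weights, which is the source of robustness."""
    return [(nu + 1.0) / (nu + (x - mu) ** 2 / sigma2) for x in xs]


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly zero."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def robust_penalized_mean(xs, nu=3.0, lam=0.0, sigma2=1.0, n_iter=50):
    """Alternate t-weight updates with an L1-penalized weighted-mean update.

    Hypothetical simplification: a single component, known scale sigma2,
    and a plain (non-adaptive) penalty lam on the mean."""
    ordered = sorted(xs)
    mu = ordered[len(ordered) // 2]  # start from the (upper) median
    for _ in range(n_iter):
        u = t_weights(xs, mu, sigma2, nu)
        total = sum(u)
        weighted_mean = sum(ui * xi for ui, xi in zip(u, xs)) / total
        # Minimizer of sum_i u_i (x_i - mu)^2 / (2 * sigma2) + lam * |mu|
        mu = soft_threshold(weighted_mean, lam * sigma2 / total)
    return mu


if __name__ == "__main__":
    expression = [0.1, -0.2, 0.05, 0.0, 0.15, 25.0]  # toy values, one outlier
    print("plain mean:     ", sum(expression) / len(expression))
    print("robust mean:    ", robust_penalized_mean(expression))
    print("penalized mean: ", robust_penalized_mean(expression, lam=10.0))
```

In this toy setting the outlier is down-weighted rather than discarded, and a sufficiently large penalty sets the mean exactly to zero. In a penalized clustering model this zeroing is the mechanism by which a gene whose cluster means all shrink to a common value is treated as uninformative.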
