Department of Mathematical Statistics, School of Statistics, Shandong University of Finance and Economics, Jinan 250014, China.
Ph.D. Program in Computer Science, The Graduate Center, The City University of New York, New York, NY 10016, USA.
Bioinformatics. 2017 Oct 1;33(19):3080-3087. doi: 10.1093/bioinformatics/btx360.
A complex disease is usually driven by a number of genes interwoven into networks, rather than by a single gene product. Network comparison, or differential network analysis, has become an important means of revealing the underlying mechanism of pathogenesis and of identifying clinical biomarkers for disease classification. Most studies, however, are limited to network correlations that mainly capture linear relationships among genes, or rely on the assumption of a parametric probability distribution of gene measurements. Both restrictions limit their usefulness in real applications.
We propose a new Joint density based non-parametric Differential Interaction Network Analysis and Classification (JDINAC) method to identify differential interaction patterns of network activation between two groups. At the same time, JDINAC uses the network biomarkers to build a classification model. The novelty of JDINAC lies in its potential to capture non-linear relations between molecular interactions using high-dimensional sparse data, and to adjust for confounding factors, without assuming a parametric probability distribution of gene measurements. Simulation studies demonstrate that JDINAC provides more accurate differential network estimation and lower classification error than other state-of-the-art methods. We apply JDINAC to a Breast Invasive Carcinoma dataset that includes 114 patients who have both tumor and matched normal samples. The hub genes and differential interaction patterns identified were consistent with existing experimental studies. Furthermore, JDINAC discriminated tumor from normal samples with high accuracy by virtue of the identified biomarkers. JDINAC provides a general framework for feature selection and classification using high-dimensional sparse omics data.
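To make the "joint density based" idea concrete, the following is a minimal Python sketch and not the authors' R implementation. It assumes that a differential interaction for a gene pair can be scored by the log ratio of two kernel density estimates of the pair's joint expression, one per group; that reading, the Gaussian product kernel, the Scott-style bandwidth, the synthetic data, and every function name are assumptions made for illustration. The step that selects among many such pair scores with a sparse classifier is omitted.

```python
import math
import random


def scott_bandwidth(values):
    """Scott-style rule-of-thumb bandwidth for 2-D data (an assumed choice)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return sd * n ** (-1.0 / 6.0)


def kde2d(points, x, y):
    """Gaussian product-kernel estimate of the joint density at (x, y)."""
    hx = scott_bandwidth([p[0] for p in points])
    hy = scott_bandwidth([p[1] for p in points])
    total = 0.0
    for px, py in points:
        zx = (x - px) / hx
        zy = (y - py) / hy
        total += math.exp(-0.5 * (zx * zx + zy * zy))
    return total / (len(points) * 2.0 * math.pi * hx * hy)


def log_density_ratio(group1, group0, x, y, eps=1e-12):
    """Hypothetical pair score: log of group-1 density over group-0 density."""
    return math.log((kde2d(group1, x, y) + eps) / (kde2d(group0, x, y) + eps))


# Synthetic (gene A, gene B) expression pairs, invented for this sketch.
rng = random.Random(0)
group1 = []
for _ in range(300):
    a = rng.gauss(0.0, 1.0)
    # Group 1: gene B depends on gene A through a non-linear (quadratic) relation.
    group1.append((a, a * a + rng.gauss(0.0, 0.3)))
# Group 0: the two genes vary independently of each other.
group0 = [(rng.gauss(0.0, 1.0), rng.gauss(1.0, 1.4)) for _ in range(300)]

on_curve = log_density_ratio(group1, group0, 0.0, 0.0)   # on the parabola
off_curve = log_density_ratio(group1, group0, 0.0, 2.5)  # away from the parabola
print(on_curve, off_curve)
```

In this toy setting the linear correlation between the two genes is close to zero in both groups, so a correlation-based comparison would see little difference, while the density-ratio score should be positive for a sample lying on the group-1 curve and negative for one lying away from it.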
R scripts are available at https://github.com/jijiadong/JDINAC.
Supplementary data are available at Bioinformatics online.