Koo Imhoi, Yao Sen, Zhang Xiang, Kim Seongho
Department of Chemistry, University of Louisville, Louisville, Kentucky 40292, USA.
J Bioinform Comput Biol. 2014 Aug;12(4):1450018. doi: 10.1142/S0219720014500188. Epub 2014 Aug 7.
The Gaussian graphical model (GGM)-based method, a key approach to reverse engineering biological networks, uses partial correlation to measure the conditional dependence between two variables while controlling for the contribution of the other variables. After the partial correlation coefficients are estimated, one of the most critical steps in network construction is controlling the false discovery rate (FDR) when assessing which associations among variables are significant. Various FDR methods have been proposed, mainly for biomarker discovery, but it remains unclear which FDR method performs best for network construction. Furthermore, no study has examined the effect of network structure on network construction. We selected six FDR methods to evaluate their performance in network construction: the linear step-up procedure (BH95), the adaptive linear step-up procedure (BH00), Efron's local FDR (LFDR), Benjamini-Yekutieli's step-up procedure (BY01), Storey's q-value procedure (Storey01), and Storey-Taylor-Siegmund's adaptive step-up procedure (STS04). We further considered two network structures, random and scale-free networks, to investigate their influence on network construction. Both simulated data and real experimental data suggest that STS04 provides the highest true positive rate (TPR) or F1 score, while BY01 has the highest positive predictive value (PPV) in network construction. In addition, network structure was found to have no significant effect on the FDR methods.
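The pipeline the abstract describes (estimate partial correlations, test them, control the FDR, then score the recovered edges) can be sketched as follows. This is a minimal illustration, not the authors' implementation: all function names are hypothetical, the Fisher z-test used to obtain p-values is an assumption (the abstract does not say how p-values were computed), the inverse sample covariance requires more samples than variables, and only two of the six methods (BH95 and BY01) are covered.

```python
import math
import numpy as np

def partial_correlations(X):
    """Partial correlation matrix from the inverse sample covariance (assumes n > p)."""
    precision = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def fisher_pvalues(r, n, p):
    """Two-sided p-values via the Fisher z-transform (assumed test),
    conditioning on the remaining p - 2 variables."""
    z = np.arctanh(np.clip(r, -0.999999, 0.999999)) * math.sqrt(n - (p - 2) - 3)
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])

def step_up(pvals, alpha=0.05, dependent=False):
    """Linear step-up procedure: BH95 when dependent=False, BY01 when dependent=True.
    Returns a boolean mask of rejected hypotheses (declared edges)."""
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    # BY01 divides the BH95 thresholds by the harmonic sum 1 + 1/2 + ... + 1/m.
    c = sum(1.0 / k for k in range(1, m + 1)) if dependent else 1.0
    order = np.argsort(pvals)
    thresholds = alpha * np.arange(1, m + 1) / (m * c)
    passed = np.nonzero(pvals[order] <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if passed.size:
        # Reject every hypothesis up to the largest rank that meets its threshold.
        reject[order[: passed[-1] + 1]] = True
    return reject

def edge_metrics(reject, truth):
    """True positive rate, positive predictive value, and F1 score for declared edges."""
    tp = int(np.sum(reject & truth))
    fp = int(np.sum(reject & ~truth))
    fn = int(np.sum(~reject & truth))
    tpr = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * tpr * ppv / (tpr + ppv) if tpr + ppv else 0.0
    return tpr, ppv, f1

# Toy data (made up for illustration): 5 variables with a single true edge 0 - 1.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
X[:, 1] += 0.8 * X[:, 0]
pcor = partial_correlations(X)
iu = np.triu_indices(5, k=1)            # candidate edges; pair (0, 1) comes first
pvals = fisher_pvalues(pcor[iu], n=200, p=5)
truth = np.zeros(len(pvals), dtype=bool)
truth[0] = True
for name, dep in [("BH95", False), ("BY01", True)]:
    print(name, edge_metrics(step_up(pvals, dependent=dep), truth))
```

Because BY01 scales every BH95 threshold down by the harmonic sum, its rejections at a given level are always a subset of BH95's. That makes it the more conservative of the two, which is in line with the high PPV the abstract reports for BY01.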