IEEE/ACM Trans Comput Biol Bioinform. 2018 May-Jun;15(3):760-773. doi: 10.1109/TCBB.2017.2665495. Epub 2017 Feb 7.
A major challenge in genomics is detecting interactions that reflect functional associations from large-scale observations. In this study, we propose a new algorithm, cPLS, which combines the partial least squares approach with negative binomial regression to reconstruct a genomic association network from high-dimensional next-generation sequencing count data. The approach is applied directly to raw count data and requires no further pre-processing steps. In the settings investigated, cPLS outperformed two widely used comparison methods: the graphical lasso and weighted correlation network analysis. In addition, cPLS can estimate the full network for thousands of genes without a heavy computational load. Finally, we demonstrate that cPLS is capable of finding biologically meaningful associations by analyzing an example data set from a previously published study that examined the molecular anatomy of craniofacial development.