netReg: network-regularized linear models for biological association studies.

Author information

Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland.

Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany.

Publication information

Bioinformatics. 2018 Mar 1;34(5):896-898. doi: 10.1093/bioinformatics/btx677.

Abstract

SUMMARY

Modelling biological associations or dependencies using linear regression is often complicated when the analyzed datasets are high-dimensional and fewer observations than variables are available (n ≪ p). For genomic datasets, penalized regression methods have been applied to address this issue. Recently proposed regression models utilize prior knowledge on dependencies, e.g. in the form of graphs, arguing that this information will lead to more reliable estimates for regression coefficients. However, none of the proposed models for multivariate genomic response variables has been implemented as a computationally efficient, freely available library. In this paper we propose netReg, a package for graph-penalized regression models that use large networks and thousands of variables. netReg incorporates a priori generated biological graph information into linear models, yielding sparse or smooth solutions for regression coefficients.
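
As a rough illustration of how a graph penalty can yield smooth coefficient estimates when n ≪ p, the sketch below fits a single-response linear model with a Laplacian penalty in closed form. It is not netReg's API or solver: the function names, the chain-shaped graph, the penalty weight psi and the tiny ridge term are all assumptions made for this example, and the multivariate responses and sparse solutions mentioned above are not covered.

```python
import numpy as np


def chain_laplacian(p):
    """Graph Laplacian L = D - A of a path graph linking variable j to j + 1.

    The chain graph is a hypothetical stand-in for a biological network.
    """
    A = np.zeros((p, p))
    idx = np.arange(p - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A


def laplacian_regression(X, y, L, psi, ridge=1e-8):
    """Minimise ||y - X b||^2 + psi * b' L b + ridge * ||b||^2.

    Setting the gradient to zero gives the linear system
    (X'X + psi * L + ridge * I) b = X'y.
    The tiny ridge term is an assumption added for numerical stability.
    """
    p = X.shape[1]
    lhs = X.T @ X + psi * L + ridge * np.eye(p)
    return np.linalg.solve(lhs, X.T @ y)


def roughness(b, L):
    """Penalty value b' L b: sum of squared differences across graph edges."""
    return float(b @ L @ b)


# Simulated data with fewer observations than variables (n << p).
rng = np.random.default_rng(0)
n, p = 20, 100
X = rng.standard_normal((n, p))
# True coefficients are constant within two blocks of neighbouring variables.
beta_true = np.concatenate([np.ones(p // 2), -np.ones(p - p // 2)])
y = X @ beta_true + 0.1 * rng.standard_normal(n)

L = chain_laplacian(p)
beta_weak = laplacian_regression(X, y, L, psi=1e-2)
beta_strong = laplacian_regression(X, y, L, psi=10.0)

print("roughness with weak penalty:  ", roughness(beta_weak, L))
print("roughness with strong penalty:", roughness(beta_strong, L))
```

A larger psi pulls the coefficients of neighbouring variables towards each other, so the estimate becomes smoother along the graph. A sparse solution would additionally need an L1 penalty and an iterative solver, which this sketch leaves out.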

AVAILABILITY AND IMPLEMENTATION

netReg is implemented as both an R package and a C++ command-line tool. The main computations are done in C++, where we use Armadillo for fast matrix calculations and Dlib for optimization. The R package is freely available on Bioconductor at https://bioconductor.org/packages/netReg. The command-line tool can be installed using the conda channel Bioconda. Installation details, issue reports, development versions, documentation and tutorials for the R and C++ versions, and the R package vignette can be found on GitHub at https://dirmeier.github.io/netReg/. The GitHub page also contains code for benchmarking and example datasets used in this paper.

CONTACT

simon.dirmeier@bsse.ethz.ch.

