Avoiding pitfalls in L1-regularised inference of gene networks.

Authors

Tjärnberg Andreas, Nordling Torbjörn E M, Studham Matthew, Nelander Sven, Sonnhammer Erik L L

Affiliation

Stockholm Bioinformatics Centre, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden.

Publication information

Mol Biosyst. 2015 Jan;11(1):287-96. doi: 10.1039/c4mb00419a. Epub 2014 Nov 7.

DOI:10.1039/c4mb00419a
PMID:25377664
Abstract

Statistical regularisation methods such as LASSO and related L1 regularised regression methods are commonly used to construct models of gene regulatory networks. Although they can theoretically infer the correct network structure, they have been shown in practice to make errors, i.e. leave out existing links and include non-existing links. We show that L1 regularisation methods typically produce a poor network model when the analysed data are ill-conditioned, i.e. the gene expression data matrix has a high condition number, even if it contains enough information for correct network inference. However, the correct structure of network models can be obtained for informative data, data with such a signal to noise ratio that existing links can be proven to exist, when these methods fail, by using least-squares regression and setting small parameters to zero, or by using robust network inference, a recent method taking the intersection of all non-rejectable models. Since available experimental data sets are generally ill-conditioned, we recommend to check the condition number of the data matrix to avoid this pitfall of L1 regularised inference, and to also consider alternative methods.
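
As a rough illustration of the two practical suggestions in the abstract (checking the condition number of the data matrix, and using least-squares regression with small parameters set to zero), here is a minimal Python sketch. It is not the authors' implementation. The linear steady-state model `A @ Y = -P`, the toy 3-gene network, the noise level, and the `cutoff` value are all assumptions made for this example; the abstract itself does not specify them.

```python
import numpy as np

def condition_number(Y):
    # Ratio of the largest to the smallest singular value of Y.
    s = np.linalg.svd(Y, compute_uv=False)
    return s[0] / s[-1]

def least_squares_network(Y, P, cutoff):
    # Least-squares estimate of A under the ASSUMED model A @ Y = -P,
    # followed by setting entries with magnitude below `cutoff` to zero.
    A_hat = -P @ np.linalg.pinv(Y)
    A_hat[np.abs(A_hat) < cutoff] = 0.0
    return A_hat

# Hypothetical 3-gene network (rows: target gene, columns: regulator).
A_true = np.array([[-1.0, 0.0, 0.5],
                   [0.8, -1.0, 0.0],
                   [0.0, 0.0, -1.0]])

# One perturbation experiment per gene, plus a small amount of noise.
P = np.eye(3)
rng = np.random.default_rng(0)
Y = -np.linalg.inv(A_true) @ P + 1e-3 * rng.standard_normal((3, 3))

print("condition number of Y:", condition_number(Y))

A_hat = least_squares_network(Y, P, cutoff=0.1)
print("structure recovered:", np.array_equal(A_hat != 0, A_true != 0))
```

In this toy setting the data matrix is well-conditioned and the noise is small, so thresholded least squares recovers the link structure. With real data, the cutoff would need to be chosen with the noise level in mind, and a high condition number would be a warning sign for L1-regularised methods, as the abstract describes.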


Similar articles

1
Avoiding pitfalls in L1-regularised inference of gene networks.
Mol Biosyst. 2015 Jan;11(1):287-96. doi: 10.1039/c4mb00419a. Epub 2014 Nov 7.
2
Inferring gene regulatory networks by integrating ChIP-seq/chip and transcriptome data via LASSO-type regularization methods.
Methods. 2014 Jun 1;67(3):294-303. doi: 10.1016/j.ymeth.2014.03.006. Epub 2014 Mar 17.
3
Automatic kernel regression modelling using combined leave-one-out test score and regularised orthogonal least squares.
Int J Neural Syst. 2004 Feb;14(1):27-37. doi: 10.1142/S0129065704001875.
4
Biological Network Inference and analysis using SEBINI and CABIN.
Methods Mol Biol. 2009;541:551-76. doi: 10.1007/978-1-59745-243-4_24.
5
Weighted-LASSO for structured network inference from time course data.
Stat Appl Genet Mol Biol. 2010;9:Article 15. doi: 10.2202/1544-6115.1519. Epub 2010 Feb 1.
6
Exploring the operational characteristics of inference algorithms for transcriptional networks by means of synthetic data.
Artif Life. 2008 Winter;14(1):49-63. doi: 10.1162/artl.2008.14.1.49.
7
Selective integration of multiple biological data for supervised network inference.
Bioinformatics. 2005 May 15;21(10):2488-95. doi: 10.1093/bioinformatics/bti339. Epub 2005 Feb 22.
8
Optimal sparsity criteria for network inference.
J Comput Biol. 2013 May;20(5):398-408. doi: 10.1089/cmb.2012.0268.
9
A new multiple regression approach for the construction of genetic regulatory networks.
Artif Intell Med. 2010 Feb-Mar;48(2-3):153-60. doi: 10.1016/j.artmed.2009.11.001. Epub 2009 Dec 5.
10
Parametric inference in the large data limit using maximally informative models.
Neural Comput. 2014 Apr;26(4):637-53. doi: 10.1162/NECO_a_00568. Epub 2014 Jan 30.

Cited by

1
Machine learning methods for gene regulatory network inference.
Brief Bioinform. 2025 Aug 31;26(5). doi: 10.1093/bib/bbaf470.
2
BiGSM: Bayesian inference of gene regulatory network via sparse modelling.
Bioinformatics. 2025 Jun 2;41(6). doi: 10.1093/bioinformatics/btaf318.
3
Gene regulatory network analysis identifies MYL1, MDH2, GLS, and TRIM28 as the principal proteins in the response of mesenchymal stem cells to Mg ions.
Comput Struct Biotechnol J. 2024 Apr 14;23:1773-1785. doi: 10.1016/j.csbj.2024.04.033. eCollection 2024 Dec.
4
Knowledge of the perturbation design is essential for accurate gene regulatory network inference.
Sci Rep. 2022 Oct 3;12(1):16531. doi: 10.1038/s41598-022-19005-x.
5
Optimal Sparsity Selection Based on an Information Criterion for Accurate Gene Regulatory Network Inference.
Front Genet. 2022 Jul 13;13:855770. doi: 10.3389/fgene.2022.855770. eCollection 2022.
6
Uncovering cancer gene regulation by accurate regulatory network inference from uninformative data.
NPJ Syst Biol Appl. 2020 Nov 9;6(1):37. doi: 10.1038/s41540-020-00154-6.
7
Perturbation-based gene regulatory network inference to unravel oncogenic mechanisms.
Sci Rep. 2020 Aug 25;10(1):14149. doi: 10.1038/s41598-020-70941-y.
8
LiPLike: towards gene regulatory network predictions of high certainty.
Bioinformatics. 2020 Apr 15;36(8):2522-2529. doi: 10.1093/bioinformatics/btz950.
9
LASSIM-A network inference toolbox for genome-wide mechanistic modeling.
PLoS Comput Biol. 2017 Jun 22;13(6):e1005608. doi: 10.1371/journal.pcbi.1005608. eCollection 2017 Jun.
10
The DIONESUS algorithm provides scalable and accurate reconstruction of dynamic phosphoproteomic networks to reveal new drug targets.
Integr Biol (Camb). 2015 Jul;7(7):776-91. doi: 10.1039/c5ib00065c.