
Biologically weighted LASSO: enhancing functional interpretability in gene expression data analysis.

Affiliation

Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, Milan 20133, Italy.

Publication information

Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae605.

DOI:10.1093/bioinformatics/btae605
PMID:39412436
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11639179/
Abstract

MOTIVATION

Feature selection approaches are widely used in gene expression data analysis to identify the most relevant features and boost performance in regression and classification tasks. However, such algorithms consider only each feature's quantitative contribution to the task, which may limit the biological interpretability of the results. Feature-related prior knowledge, such as functional annotations and pathway information, can be incorporated into feature selection algorithms to potentially improve both model performance and interpretability.

RESULTS

We propose an embedded integrative approach to feature selection that combines weighted LASSO feature selection and prior biological knowledge in a single step, by means of a novel score of biological relevance that summarizes information extracted from popular biological knowledge bases. Our experiments indicate that the proposed approach identifies the most predictive genes while also enhancing the biological interpretability of the results compared to the standard LASSO regularized model.
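To make the idea of a weighted LASSO concrete, the following is a minimal sketch, not the authors' implementation. It minimizes (1/2n)·||y − Xβ||² + λ·Σ_j w_j·|β_j| by coordinate descent, without an intercept. The relevance scores, the mapping `1 - score` from score to penalty weight, the value of `lam`, and the synthetic data are all illustrative assumptions; the abstract does not specify how the biological relevance score enters the penalty.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def weighted_lasso(X, y, penalty_weights, lam, n_iter=200):
    """Coordinate descent for
    (1/2n) * ||y - X b||^2 + lam * sum_j penalty_weights[j] * |b_j|.
    No intercept is fitted; X and y are assumed to be centered."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # per-feature curvature
    residual = y.astype(float).copy()      # equals y - X @ beta for beta = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the partial residual
            # (the residual with feature j's own contribution added back).
            rho = X[:, j] @ residual / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * penalty_weights[j]) / col_sq[j]
            residual += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta

# Synthetic example: genes 0 and 1 are equally predictive, genes 2-4 are noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = X[:, 0] + X[:, 1] + 0.1 * rng.standard_normal(200)

# Hypothetical biological relevance scores in [0, 1] (assumed, not from the paper).
relevance = np.array([0.9, 0.1, 0.5, 0.5, 0.5])
# Assumed mapping: more relevant genes receive a smaller penalty.
penalty_weights = 1.0 - relevance

lam = 0.5
beta_weighted = weighted_lasso(X, y, penalty_weights, lam)
beta_plain = weighted_lasso(X, y, np.ones(5), lam)  # standard LASSO

print("weighted:", np.round(beta_weighted, 3))
print("plain:   ", np.round(beta_plain, 3))
```

In this toy setting the two informative genes have the same true effect, but the gene with the higher assumed relevance score is shrunk less than its counterpart, and less than it would be under the uniform penalty. This mirrors the general mechanism by which per-feature penalty weights can steer selection toward features supported by prior knowledge.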

AVAILABILITY AND IMPLEMENTATION

Code is available at https://github.com/DEIB-GECO/GIS-weigthed_LASSO.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d086/11639179/751e8df5534c/btae605f1.jpg
