Deep Large-Scale Multitask Learning Network for Gene Expression Inference.

Affiliations

Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.

Department of Pediatrics, UPMC Children's Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.

Publication information

J Comput Biol. 2021 May;28(5):485-500. doi: 10.1089/cmb.2020.0438.

DOI: 10.1089/cmb.2020.0438
PMID: 34014778
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8165479/
Abstract

Gene expression profiling enables biological studies across many fields because it thoroughly characterizes cellular states under various experimental conditions. Despite recent advances in high-throughput technology, profiling an entire set of genomes is still difficult and expensive. Because the expression patterns of different genes are highly correlated, this problem can be addressed with a cost-effective approach that measures only a small subset of genes, called landmark genes, that represents the entire set, and infers the remaining genes, called target genes, using a computational model. Several shallow and deep regression models in the literature estimate the expression of target genes from the landmark genes. However, the shallow models mostly have limited capacity to learn nonlinear and complex gene expression data and are prone to underfitting, whereas the deep models generally do not take advantage of the correlation among target genes during learning and suffer from overfitting. Treating gene expression inference as a multitask learning problem, we propose a new deep multitask learning algorithm to tackle these issues. Our learning framework automatically learns the correlation between target genes and uses this knowledge to improve its generalization. Specifically, we use a subnetwork with low-dimensional latent variables to discover the relationships between target genes and impose a seamless, easy-to-implement regularization on our deep regression model. Unlike existing multitask learning methods, which can deal with only dozens or hundreds of tasks, our algorithm efficiently learns the relationships between ∼10,000 target genes and is thus scalable to a large number of tasks. Our proposed method outperforms the shallow and deep regression models for gene expression inference, as well as alternative multitask learning algorithms, on two large-scale datasets regardless of the network architecture.
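
The landmark-to-target setup described in the abstract can be illustrated with a small NumPy sketch. This is not the paper's algorithm: the abstract does not specify the network architecture, the subnetwork, or the regularizer, so the sketch swaps in a much simpler stand-in. It fits a shallow multi-output ridge regression (one independent task per target gene) and then truncates the weight matrix to rank k, so that all tasks share k latent factors. The synthetic data, the function names (make_synthetic, fit_ridge, low_rank_project, mae), and every parameter value are assumptions made for illustration only.

```python
import numpy as np

def make_synthetic(n_samples=200, n_landmark=20, n_target=50,
                   n_latent=5, noise=0.1, seed=0):
    """Hypothetical data: target genes are correlated through a few latent factors."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, n_landmark))   # landmark gene expression
    A = rng.standard_normal((n_landmark, n_latent))    # landmarks -> latent factors
    B = rng.standard_normal((n_latent, n_target))      # latent factors -> target genes
    Y = X @ A @ B + noise * rng.standard_normal((n_samples, n_target))
    return X, Y

def fit_ridge(X, Y, lam=1.0):
    """Multi-output ridge regression: each target gene is an independent linear task."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def low_rank_project(W, k):
    """Keep the top-k singular directions of W so all tasks share k latent factors."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

def mae(Y_true, Y_pred):
    """Mean absolute error over all samples and target genes."""
    return float(np.mean(np.abs(Y_true - Y_pred)))

if __name__ == "__main__":
    X, Y = make_synthetic()
    W = fit_ridge(X, Y)              # shape: (n_landmark, n_target)
    W_lr = low_rank_project(W, k=5)  # same shape, rank at most 5
    print("ridge MAE:   ", mae(Y, X @ W))
    print("low-rank MAE:", mae(Y, X @ W_lr))
```

The low-rank step is only a linear analogue of sharing information across tasks through low-dimensional latent variables. A deep model would replace the linear map with a neural network and learn the task relationships jointly during training rather than by post hoc truncation.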
