Deep Learning Enables Fast and Accurate Imputation of Gene Expression.

Authors

Viñas Ramon, Azevedo Tiago, Gamazon Eric R, Liò Pietro

Affiliations

Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom.

Vanderbilt Genetics Institute and Data Science Institute, VUMC, Nashville, TN, United States.

Publication information

Front Genet. 2021 Apr 13;12:624128. doi: 10.3389/fgene.2021.624128. eCollection 2021.

DOI: 10.3389/fgene.2021.624128
PMID: 33927746
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8076954/
Abstract

A question of fundamental biological significance is to what extent the expression of a subset of genes can be used to recover the full transcriptome, with important implications for biological discovery and clinical application. To address this challenge, we propose two novel deep learning methods, PMI and GAIN-GTEx, for gene expression imputation. In order to increase the applicability of our approach, we leverage data from GTEx v8, a reference resource that has generated a comprehensive collection of transcriptomes from a diverse set of human tissues. We show that our approaches compare favorably to several standard and state-of-the-art imputation methods in terms of predictive performance and runtime in two case studies and two imputation scenarios. In comparison conducted on the protein-coding genes, PMI attains the highest performance in inductive imputation whereas GAIN-GTEx outperforms the other methods in in-place imputation. Furthermore, our results indicate strong generalization on RNA-Seq data from 3 cancer types across varying levels of missingness. Our work can facilitate a cost-effective integration of large-scale RNA biorepositories into genomic studies of disease, with high applicability across diverse tissue types.
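The abstract reports performance "across varying levels of missingness" without showing what such an evaluation looks like. The sketch below illustrates the general setup only: hide a fraction of entries, impute them, and score the hidden entries. It is not PMI or GAIN-GTEx. The data are synthetic (a low-rank matrix standing in for expression values, not GTEx), the two imputers are generic baselines (per-gene mean and iterative truncated SVD), and all function names and parameters are hypothetical choices made for illustration.

```python
import numpy as np


def make_expression(n_samples, n_genes, rank, rng):
    """Synthetic low-rank matrix standing in for expression data (not GTEx)."""
    factors = rng.normal(size=(n_samples, rank))
    loadings = rng.normal(size=(rank, n_genes))
    noise = 0.1 * rng.normal(size=(n_samples, n_genes))
    return factors @ loadings + noise


def impute_gene_mean(x, mask):
    """Fill hidden entries (mask == True) with each gene's observed mean."""
    observed = ~mask
    counts = np.maximum(observed.sum(axis=0), 1)  # avoid division by zero
    gene_means = np.where(observed, x, 0.0).sum(axis=0) / counts
    return np.where(mask, gene_means, x)


def impute_low_rank(x, mask, rank, n_iter=50):
    """Iterative truncated-SVD baseline: refit a rank-`rank` approximation
    and overwrite only the hidden entries at each step."""
    filled = impute_gene_mean(x, mask)
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Observed entries keep their values; hidden ones take the estimate.
        filled = np.where(mask, approx, x)
    return filled


def masked_rmse(x_true, x_imputed, mask):
    """Root mean squared error computed on the hidden entries only."""
    diff = (x_true - x_imputed)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


def evaluate(missing_rates, seed=0, n_samples=200, n_genes=50, rank=5):
    """Return {rate: (gene-mean RMSE, low-rank RMSE)} for each missing rate."""
    rng = np.random.default_rng(seed)
    x = make_expression(n_samples, n_genes, rank, rng)
    results = {}
    for rate in missing_rates:
        mask = rng.random(x.shape) < rate  # entries hidden completely at random
        results[rate] = (
            masked_rmse(x, impute_gene_mean(x, mask), mask),
            masked_rmse(x, impute_low_rank(x, mask, rank), mask),
        )
    return results


if __name__ == "__main__":
    for rate, (rmse_mean, rmse_lr) in evaluate([0.1, 0.3, 0.5]).items():
        print(f"missing={rate:.1f}  gene-mean RMSE={rmse_mean:.3f}  "
              f"low-rank RMSE={rmse_lr:.3f}")
```

Scoring only the hidden entries keeps observed values from inflating the result. The low-rank baseline is included because it uses correlations between genes, which is the premise behind recovering a transcriptome from a subset of genes; the gene-mean baseline ignores those correlations.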
