FAVA: high-quality functional association networks inferred from scRNA-seq and proteomics data.

Author affiliations

Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark.

VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium.

Publication information

Bioinformatics. 2024 Feb 1;40(2). doi: 10.1093/bioinformatics/btae010.

Abstract

MOTIVATION

Protein networks are commonly used for understanding how proteins interact. However, they are typically biased by data availability, favoring well-studied proteins with more interactions. To uncover functions of understudied proteins, we must use data that are not affected by this literature bias, such as single-cell RNA-seq and proteomics. These data are sparse and redundant, however, which complicates functional association analysis.

RESULTS

To address this, we have developed FAVA (Functional Associations using Variational Autoencoders), which compresses high-dimensional data into a low-dimensional space. FAVA infers networks from high-dimensional omics data with much higher accuracy than existing methods, across a diverse collection of real as well as simulated datasets. FAVA can process large datasets with over 0.5 million conditions and has predicted 4210 interactions between 1039 understudied proteins. Our findings showcase FAVA's capability to offer novel perspectives on protein interactions. FAVA functions within the scverse ecosystem, employing AnnData as its input source.
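
The abstract does not say how associations are scored once the data have been compressed. As a minimal sketch, assume each protein ends up with a short latent vector and that pairs are ranked by the Pearson correlation of those vectors. The embeddings, the protein names `P1` to `P3`, and the helper names `pearson` and `rank_associations` below are hypothetical and are not taken from the FAVA code base; the variational autoencoder itself is not implemented here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_associations(latent, top_k=None):
    """Score every protein pair by latent-space correlation, highest first."""
    scored = [
        (a, b, pearson(latent[a], latent[b]))
        for a, b in combinations(sorted(latent), 2)
    ]
    scored.sort(key=lambda t: t[2], reverse=True)
    return scored if top_k is None else scored[:top_k]


# Hypothetical latent embeddings (assumption: one short vector per protein,
# as an encoder might produce). The values are made up for illustration.
latent = {
    "P1": [0.1, 0.9, 0.4, 0.7],
    "P2": [0.2, 1.0, 0.5, 0.8],
    "P3": [0.9, 0.1, 0.6, 0.2],
}

# Print the ranked pairs as a tab-separated edge list.
for a, b, score in rank_associations(latent):
    print(f"{a}\t{b}\t{score:.3f}")
```

In this toy input `P2` is `P1` shifted by a constant, so that pair ranks first, while both pairs involving `P3` score negatively. A real network would keep only pairs above a chosen cutoff, or the top-scoring pairs via `top_k`.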

AVAILABILITY AND IMPLEMENTATION

Source code, documentation, and tutorials for FAVA are accessible on GitHub at https://github.com/mikelkou/fava. FAVA can also be installed and used via pip/PyPI as well as via the scverse ecosystem https://github.com/scverse/ecosystem-packages/tree/main/packages/favapy.
