Department of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland.
SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland.
Nucleic Acids Res. 2023 Jan 6;51(D1):D638-D646. doi: 10.1093/nar/gkac1000.
Much of the complexity within cells arises from functional and regulatory interactions among proteins. The core of these interactions is increasingly known, but novel interactions continue to be discovered, and the information remains scattered across different database resources, experimental modalities and levels of mechanistic detail. The STRING database (https://string-db.org/) systematically collects and integrates protein-protein interactions, both physical interactions and functional associations. The data originate from a number of sources: automated text mining of the scientific literature, computational interaction predictions from co-expression, conserved genomic context, databases of interaction experiments and known complexes/pathways from curated sources. All of these interactions are critically assessed, scored, and subsequently automatically transferred to less well-studied organisms using hierarchical orthology information. The data can be accessed via the website, but also programmatically and via bulk downloads. The most recent developments in STRING (version 12.0) are: (i) it is now possible to create, browse and analyze a full interaction network for any novel genome of interest, by submitting its complement of encoded proteins; (ii) the co-expression channel now uses variational auto-encoders to predict interactions, and it covers two new sources, single-cell RNA-seq and experimental proteomics data; and (iii) the confidence in each experimentally derived interaction is now estimated based on the detection method used, and communicated to the user in the web interface. Furthermore, STRING continues to enhance its facilities for functional enrichment analysis, which are now also fully available for user-submitted genomes.
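As an illustration of the programmatic access mentioned above, the following minimal sketch builds a request URL for a STRING network query. The URL layout (`/api/<format>/<method>`), the parameter names `identifiers`, `species` and `required_score`, and the carriage-return separator between identifiers are assumptions taken from STRING's public API conventions rather than from this abstract, and should be checked against the current API documentation. The helper name `build_network_url` is hypothetical. The sketch only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumed base URL of the STRING REST API; verify against the current docs.
STRING_API_BASE = "https://string-db.org/api"


def build_network_url(identifiers, species, output_format="tsv", required_score=None):
    """Build a URL for STRING's 'network' method (hypothetical helper).

    identifiers    -- list of protein names or STRING identifiers
    species        -- NCBI taxonomy identifier, e.g. 9606 for human
    output_format  -- response format segment of the URL (assumed: 'tsv', 'json', ...)
    required_score -- optional minimum combined score (assumed range 0-1000)
    """
    if not identifiers:
        raise ValueError("at least one identifier is required")
    params = {
        # Multiple identifiers are assumed to be separated by a carriage return,
        # which urlencode percent-encodes as %0D.
        "identifiers": "\r".join(identifiers),
        "species": species,
    }
    if required_score is not None:
        params["required_score"] = required_score
    return f"{STRING_API_BASE}/{output_format}/network?{urlencode(params)}"


if __name__ == "__main__":
    # Example: interactions between two human proteins.
    print(build_network_url(["TP53", "MDM2"], species=9606))
```

The resulting URL could then be fetched with any HTTP client; for large-scale analyses, the bulk downloads are likely the more appropriate route.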