Department of Systems Biology, Columbia University, New York, New York, USA.
Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Department of Medicine, Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, New York, USA.
J Biol Chem. 2021 Jan-Jun;296:100562. doi: 10.1016/j.jbc.2021.100562. Epub 2021 Mar 18.
Systems biology is a data-heavy field that focuses on systems-wide depictions of biological phenomena, necessarily sacrificing a detailed characterization of individual components. As an example, genome-wide protein interaction networks are widely used in systems biology and are continuously extended and refined as new sources of evidence become available. Despite the vast amount of information about individual protein structures and protein complexes that has accumulated in the Protein Data Bank over the past 50 years, the data, computational tools, and language of structural biology are not an integral part of systems biology. However, increasing effort has been devoted to this integration, and the related literature is reviewed here. Relationships between proteins that are detected via structural similarity offer a rich source of information not available from sequence similarity, and homology modeling can be used to leverage Protein Data Bank structures to produce 3D models for a significant fraction of many proteomes. A number of structure-informed genomic and cross-species (i.e., virus-host) interactomes are described, and the unique information they provide is illustrated with several examples. Tissue- and tumor-specific interactomes have also been developed through computational strategies that exploit patient information and through genetic interactions available from increasingly sensitive screens. Strategies to integrate structural information with these alternative data sources are also described. Finally, efforts to link protein structure space with chemical compound space offer novel sources of information for drug design, off-target identification, and the identification of targets for compounds found to be effective in phenotypic screens.