Jez Joseph M
Department of Biology, Washington University in St. Louis, One Brookings Drive, CB1137 St. Louis, MO 63130, United States.
J Invertebr Pathol. 2017 Jan;142:11-15. doi: 10.1016/j.jip.2016.07.013. Epub 2016 Jul 30.
The expansion of genomic data, three-dimensional structures of proteins, and computing power continues to improve our understanding of how protein structure-function relationships evolve. As of June 2016, publicly available databases contain more than 60 million unique protein sequences, which group into 16,295 protein families adopting ∼1400 different three-dimensional folds. These data support the exploration of evolutionary relationships between protein structure and function to answer a basic question: how do changes in gene sequence lead to alterations in protein structure and to the tailoring of biological and chemical function? This mini-review aims to provide a primer on the basics of protein structure, how evolution of sequence leads to diversity in protein structure and function, how these changes occur, and the role of domains in protein evolution. Understanding how to use the vast amount of sequence and structural information may also aid in assessing whether changes in protein sequence and/or structure are relevant for safety assessments of new commercial biotechnology products.