Sealy Center for Structural Biology and Molecular Biophysics, Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX, 77555-0304, USA.
Foundation for Applied Molecular Evolution, Inc., Alachua, FL, 32615-9495, USA.
Sci Rep. 2017 Oct 24;7(1):13940. doi: 10.1038/s41598-017-13957-1.
Proteins are fundamental to life and exhibit a wide diversity of activities, some of which are toxic. Assessing whether a specific protein is safe for consumption in foods and feeds is therefore critical. A simple BLAST search may reveal homology to a known toxin even when the protein poses no real danger. A further obstacle to answering this question is the lack of curated databases containing a representative set of experimentally validated toxins. Here we have systematically analyzed over 10,000 manually curated toxin sequences using sequence clustering, network analysis, and protein domain classification. We also developed a functional sequence signature method to distinguish toxic from non-toxic proteins. The current database, combined with motif analysis, can be used by researchers and regulators as a hazard screening tool to assess the potential of a protein to be toxic at early stages of development. Identifying key signatures of toxicity can also aid in redesigning proteins so that they maintain their desirable functions while reducing the risk of potential health hazards.