The limitations of the current protein classification tools in identifying lipolytic features in putative bacterial lipase sequences.

Author affiliations

School of Engineering, Newcastle University, Newcastle-upon-Tyne NE1 7RU, UK.

Publication information

J Biotechnol. 2022 Jun 10;351:30-37. doi: 10.1016/j.jbiotec.2022.04.011. Epub 2022 May 4.

Abstract

Metagenomic sequencing has generated millions of new protein sequences, most of them with unknown functions. A relatively quick first step for function assignment is to use the existing public protein databases and their scanning tools. However, to date these tools cannot identify all sequence features, such as conserved motifs or patterns. In this study we evaluated the capability of several public protein databases (e.g., InterPro, PROSITE, ESTHER, Pfam, and AlphaFold) and their scanning tools to identify lipolytic features in 78 putative cold-adapted bacterial lipase sequences. Novel lipases that can tolerate extreme conditions have great biotechnological importance. We obtained the putative cold-adapted lipolytic sequences from a metagenomic study of an anaerobic psychrophilic microbial community treating domestic wastewater at 4 and 15 °C. Both newer and conventional protein classifiers failed to find lipolytic features in most of the putative lipases. InterProScan predicted lipase family membership for only 18 of the putative lipase sequences. For more than half of them (41 out of 78), InterProScan could not predict any protein family membership, let alone find lipolytic features. However, when the Lipase Engineering Database and AlphaFold were used, half of those sequences were classified. Conventional databases such as PROSITE found lipolytic patterns in 9 of the putative lipolytic sequences, of which only one was identified by InterProScan as a lipase. Moreover, different scanning tools made different and inconsistent predictions for a given putative lipase sequence. Even InterProScan, which integrates predictions from 13 protein member databases, did not reach a consensus prediction for a given lipase sequence. Our study shows that public protein databases lack information about bacterial lipase sequences, which limits their lipolytic feature prediction and biotechnological application. Integrating AlphaFold within InterPro can improve lipase identification and classification significantly.
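To make the idea of scanning for "conserved motifs or patterns" concrete, the following is a minimal sketch of a pattern scan in Python. It is not the method used in the study. It assumes the widely reported G-X-S-X-G pentapeptide, which surrounds the catalytic serine in many lipases and esterases, as the pattern of interest. The function name `find_gxsxg` and the toy sequences are hypothetical and are not taken from the paper's dataset.

```python
import re

# G-X-S-X-G: pentapeptide around the catalytic serine in many lipases/esterases.
# Assumption: this simple pattern stands in for the curated patterns used by
# real databases. The lookahead lets overlapping hits be reported as well.
GXSXG = re.compile(r"(?=(G.S.G))")


def find_gxsxg(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched pentapeptide) for each hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in GXSXG.finditer(seq)]


# Hypothetical toy sequences for illustration only (not from the study).
toy_sequences = {
    "toy_seq_1": "MKTLLAVGHSQGGAAL",
    "toy_seq_2": "MKTAAPLLVVDDEEKK",
    "toy_seq_3": "GASAGHSLGGMW",
}

for name, seq in toy_sequences.items():
    hits = find_gxsxg(seq)
    print(name, hits if hits else "no GXSXG motif")
```

A hit from such a short pattern is only weak evidence: the same pentapeptide occurs in other serine hydrolases, and some lipolytic enzymes carry variant motifs. This is consistent with the abstract's observation that pattern-based and family-based tools can disagree on the same sequence.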
