NIFtHool: an informatics program for identification of NifH proteins using deep neural networks.

Affiliation

Escuela de Ciencias Biológicas e Ingeniería, Universidad de Investigación de Tecnología Experimental Yachay, Urcuquí, Imbabura, 100115, Ecuador.

Publication information

F1000Res. 2022 Feb 9;11:164. doi: 10.12688/f1000research.107925.1. eCollection 2022.

Abstract

Atmospheric nitrogen fixation by microorganisms is environmentally and industrially important because it increases soil fertility and productivity. This work proposes a high-precision system that recognizes amino acid sequences of the nitrogenase enzyme (NifH) as a promising way to improve the identification of diazotrophic bacteria. To this end, a processed dataset of 4911 NifH and 4782 non-NifH amino acid sequences was built from a database obtained from UniProt. Features were then extracted with two methods, (i) k-mer counting and (ii) embedding layers, to obtain numerical vectors for the amino acid chains. For the embedding approach, the data were passed through an external trainable convolutional layer, which received a uniform matrix and applied convolution filters to obtain the feature maps of the model. Finally, a deep neural network served as the primary model to classify each amino acid sequence as NifH or non-NifH. Performance evaluation experiments yielded an accuracy of 96.4%, a sensitivity of 95.2%, and a specificity of 96.7%. An amino acid sequence-based feature extraction method that uses a neural network to detect N-fixing organisms is thus proposed and implemented. NIFtHool is available from: https://nifthool.anvil.app/.
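
To make the k-mer counting step mentioned in the abstract concrete, the following is a minimal sketch of how an amino acid sequence could be turned into a fixed-length count vector. It is an illustration under stated assumptions, not the NIFtHool implementation: the abstract does not give the value of k, the alphabet handling, or any normalization, so the 20-letter alphabet, k=2, the function name kmer_count_vector, and the toy sequence are all hypothetical.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids, one-letter codes, alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_count_vector(sequence, k=2):
    """Return overlapping k-mer counts in a fixed vocabulary order.

    The vocabulary is every possible k-mer over AMINO_ACIDS (20**k entries),
    so all sequences map to vectors of the same length. k-mers containing
    non-standard letters are simply not represented in the output.
    """
    vocabulary = ["".join(letters) for letters in product(AMINO_ACIDS, repeat=k)]
    # Slide a window of width k over the sequence, one position at a time.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return [counts[kmer] for kmer in vocabulary]


# Hypothetical toy sequence used only to exercise the function.
toy_sequence = "MAMRQCAIYGKGGIGKS"
vector = kmer_count_vector(toy_sequence, k=2)
```

A vector like this could then be fed to a classifier. A fixed vocabulary order matters because every input sequence must yield features in the same positions.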
