AGeNNT: annotation of enzyme families by means of refined neighborhood networks.

Author information

Kandlinger Florian, Plach Maximilian G, Merkl Rainer

Affiliations

Institute of Biophysics and Physical Biochemistry, University of Regensburg, D-93040, Regensburg, Germany.

Faculty of Mathematics and Computer Science, University of Hagen, D-58084, Hagen, Germany.

Publication information

BMC Bioinformatics. 2017 May 25;18(1):274. doi: 10.1186/s12859-017-1689-6.

Abstract

BACKGROUND

Large enzyme families may contain functionally diverse members that give rise to clusters in a sequence similarity network (SSN). In prokaryotes, the genome neighborhood of a gene product is indicative of its function; thus, a genome neighborhood network (GNN) deduced for an SSN provides strong clues to the specific function of the enzymes constituting the different clusters. The Enzyme Function Initiative (http://enzymefunction.org/) offers services that compute SSNs and GNNs.

RESULTS

We have implemented AGeNNT, which utilizes these services, albeit with datasets purged of unspecific protein functions and overrepresented species. AGeNNT generates refined GNNs (rGNNs) that consist of cluster-nodes representing the sequences under study and Pfam-nodes representing enzyme functions encoded in the respective neighborhoods. For cluster-nodes, AGeNNT summarizes the phylogenetic relationships of the contributing species, and a statistic indicates how unique nodes and genome neighborhoods (GNs) are within the rGNN. Pfam-nodes are annotated with additional features such as GO terms describing protein function. For each edge, the coverage is given, which is the fraction of neighborhoods containing the considered enzyme function (Pfam-node). AGeNNT is available at https://github.com/kandlinf/agennt.
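The edge coverage described above can be illustrated with a minimal sketch. It is not taken from the AGeNNT code base: the function name `edge_coverage`, the input layout (one set of Pfam labels per genome neighborhood), and the labels `PF_A`, `PF_B`, and `PF_C` are assumptions made only for this example.

```python
from collections import Counter


def edge_coverage(neighborhoods):
    """Return, per Pfam label, the fraction of neighborhoods containing it.

    `neighborhoods` is a list with one collection of Pfam labels per genome
    neighborhood of a cluster (a hypothetical layout, not AGeNNT's own).
    """
    if not neighborhoods:
        return {}
    counts = Counter()
    for families in neighborhoods:
        # Count each Pfam label at most once per neighborhood.
        counts.update(set(families))
    total = len(neighborhoods)
    return {pfam: n / total for pfam, n in counts.items()}


# Made-up cluster with four genome neighborhoods (illustrative labels only).
cluster = [
    {"PF_A", "PF_B"},
    {"PF_A"},
    {"PF_A", "PF_C"},
    {"PF_B"},
]
coverage = edge_coverage(cluster)
```

In this toy input, `PF_A` occurs in three of the four neighborhoods, so its edge to the cluster-node would carry a coverage of 0.75. High-coverage edges are the ones most likely to reflect a function shared across the cluster.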

CONCLUSIONS

An rGNN is easier to interpret than a conventional GNN, which commonly contains proteins without enzymatic function and overly specific neighborhoods due to phylogenetic bias. The implemented filter routines and the statistic allow the user to identify those neighborhoods that are most indicative of a specific metabolic capacity. Thus, AGeNNT makes it easier to distinguish and annotate functionally different members of enzyme families.
