NeuroPID: a predictor for identifying neuropeptide precursors from metazoan proteomes.

Affiliation

Department of Biological Chemistry, Institute of Life Sciences, The Edmond J. Safra Campus, The Hebrew University of Jerusalem, Givat Ram 91904, Israel.

Publication information

Bioinformatics. 2014 Apr 1;30(7):931-40. doi: 10.1093/bioinformatics/btt725. Epub 2013 Dec 13.

Abstract

MOTIVATION

The evolution of multicellular organisms is associated with increasing variability of molecules governing behavioral and physiological states. This is often achieved by neuropeptides (NPs) that are produced in neurons from a longer protein, named neuropeptide precursor (NPP). The maturation of NPs occurs through a sequence of proteolytic cleavages. The difficulty in identifying NPPs is a consequence of their diversity and the lack of applicable sequence similarity among the short functionally related NPs.

RESULTS

Herein, we describe Neuropeptide Precursor Identifier (NeuroPID), a machine learning scheme that predicts metazoan NPPs. NeuroPID was trained on hundreds of identified NPPs from the UniProtKB database. Some 600 features were extracted from the primary sequences and processed using support vector machines (SVM) and ensemble decision tree classifiers. These features combined biophysical, chemical and informational-statistical properties of NPs and NPPs. Other features were guided by the defining characteristics of the dibasic cleavage sites motif. NeuroPID reached 89-94% accuracy and 90-93% precision in cross-validation blind tests against known NPPs (with an emphasis on Chordata and Arthropoda). NeuroPID also identified NPP-like proteins from extensively studied model organisms as well as from poorly annotated proteomes. We then focused on the most significant sets of features that contribute to the success of the classifiers. We propose that NPPs are attractive targets for investigating and modulating behavior, metabolism and homeostasis and that a rich repertoire of NPs remains to be identified.
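As a rough illustration of the kind of sequence-derived features described above, the sketch below computes amino acid composition and counts of dibasic (Lys/Arg pair) sites from a primary sequence. It is a minimal example under stated assumptions, not NeuroPID's implementation: the function names, the `[KR][KR]` pattern, and the chosen features are hypothetical, and the actual tool uses some 600 features that the abstract does not enumerate.

```python
import re
from collections import Counter

# Assumption: a "dibasic site" is any adjacent pair of Lys (K) / Arg (R).
# The lookahead makes overlapping pairs (e.g. "KRK") count separately.
# This pattern is illustrative only, not NeuroPID's motif definition.
DIBASIC = re.compile(r"(?=([KR][KR]))")

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dibasic_sites(seq):
    """Return 0-based start positions of every K/R pair, overlaps included."""
    return [m.start() for m in DIBASIC.finditer(seq.upper())]


def sequence_features(seq):
    """Build a small, hypothetical feature dictionary for one protein sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    # Fraction of each of the 20 standard amino acids.
    features = {f"frac_{aa}": counts[aa] / n for aa in AMINO_ACIDS}
    sites = dibasic_sites(seq)
    features["length"] = n
    features["dibasic_count"] = len(sites)
    # Density of candidate cleavage sites, normalised per 100 residues.
    features["dibasic_per_100"] = 100.0 * len(sites) / n
    return features
```

Feature dictionaries of this form could be stacked into a matrix and passed to a standard classifier such as an SVM; which features and classifier settings NeuroPID actually uses is described in the full paper, not in this abstract.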

AVAILABILITY

NeuroPID source code is freely available at http://www.protonet.cs.huji.ac.il/neuropid
