Prediction of protein subcellular localization based on variable-length motifs detection and dissimilarity based classification.

Authors

Arango-Argoty G A, Jaramillo-Garzón J A, Röthlisberger S, Castellanos-Dominguez C G

Affiliation

Signal Processing and Recognition Group, Universidad Nacional de Colombia, Campus La Nubia, Magdalena, Colombia.

Publication information

Annu Int Conf IEEE Eng Med Biol Soc. 2011;2011:945-8. doi: 10.1109/IEMBS.2011.6090213.

DOI: 10.1109/IEMBS.2011.6090213
PMID: 22254467
Abstract

Predicting the function of unknown proteins is one of the principal goals of computational biology. Knowing the subcellular localization of a protein aids in understanding its structure and molecular function. Numerous prediction techniques have been developed, usually focusing on global information about the protein. However, predictions can also be made by identifying functional subsequence patterns known as motifs. For the motif discovery problem, many methods require aligned sequences and a window size fixed in advance. To address these problems, we propose a method based on variable-length motif characterization and detection using the continuous wavelet transform (CWT) and a dissimilarity space representation. To analyze the motifs generated by our approach, we divide the entire dataset into training (60%) and validation (40%) sets. A Support Vector Machine (SVM) classifier is used as the predictor for the validation set. The highest values, sensitivity (Sn) = 82.58% and specificity (Sp) = 92.86% across 10-fold cross-validation, are obtained for endosome proteins. The average results, Sn = 74% and Sp = 75.58%, are comparable to the current state of the art. For datasets with low sequence identity (< 40%), motif characterization and localization based on the CWT shows good performance and yields interpretable subsequences for each subcellular localization.
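Two ideas in the abstract can be illustrated with a small sketch: a dissimilarity space representation, where each sample is described by its distances to a set of prototypes, and the sensitivity (Sn) and specificity (Sp) figures used to report results. The sketch below is an illustration only, not the authors' implementation. The function names, the use of Euclidean distance, and the toy data are all assumptions; the paper's CWT-based motif features and its SVM classifier are not reproduced here.

```python
import math


def dissimilarity_representation(samples, prototypes):
    """Describe each sample by its distances to the prototypes.

    Assumption: Euclidean distance is used as the dissimilarity measure.
    The abstract does not say which measure the authors used.
    """
    return [[math.dist(s, p) for p in prototypes] for s in samples]


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (Sn, Sp) for a binary problem.

    Sn = TP / (TP + FN): fraction of positives correctly detected.
    Sp = TN / (TN + FP): fraction of negatives correctly rejected.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sn = tp / (tp + fn) if (tp + fn) else float("nan")
    sp = tn / (tn + fp) if (tn + fp) else float("nan")
    return sn, sp


# Hypothetical toy data: two 2-D feature vectors and two prototypes.
features = [(0.0, 0.0), (3.0, 4.0)]
prototypes = [(0.0, 0.0), (6.0, 8.0)]
embedded = dissimilarity_representation(features, prototypes)

# Hypothetical labels for one localization class versus the rest.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
sn, sp = sensitivity_specificity(y_true, y_pred)
```

In a dissimilarity-based pipeline, the rows of `embedded` would be the inputs to the classifier in place of the raw features. For a multi-class task such as subcellular localization, Sn and Sp are usually computed per class in a one-versus-rest fashion.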


Similar articles

1
Prediction of protein subcellular localization based on variable-length motifs detection and dissimilarity based classification.
Annu Int Conf IEEE Eng Med Biol Soc. 2011;2011:945-8. doi: 10.1109/IEMBS.2011.6090213.
2
PairProSVM: protein subcellular localization based on local pairwise profile alignment and SVM.
IEEE/ACM Trans Comput Biol Bioinform. 2008 Jul-Sep;5(3):416-22. doi: 10.1109/TCBB.2007.70256.
3
Hum-PLoc: a novel ensemble classifier for predicting human protein subcellular localization.
Biochem Biophys Res Commun. 2006 Aug 18;347(1):150-7. doi: 10.1016/j.bbrc.2006.06.059. Epub 2006 Jun 21.
4
MultiLoc: prediction of protein subcellular localization using N-terminal targeting sequences, sequence motifs and amino acid composition.
Bioinformatics. 2006 May 15;22(10):1158-65. doi: 10.1093/bioinformatics/btl002. Epub 2006 Jan 20.
5
SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition.
BMC Bioinformatics. 2007 May 22;8 Suppl 4(Suppl 4):S2. doi: 10.1186/1471-2105-8-S4-S2.
6
ProLoc-GO: utilizing informative Gene Ontology terms for sequence-based prediction of protein subcellular localization.
BMC Bioinformatics. 2008 Feb 1;9:80. doi: 10.1186/1471-2105-9-80.
7
Subcellular localization prediction with new protein encoding schemes.
IEEE/ACM Trans Comput Biol Bioinform. 2007 Apr-Jun;4(2):227-32. doi: 10.1109/TCBB.2007.070209.
8
Prediction of protein subcellular locations using a new measure of information discrepancy.
J Bioinform Comput Biol. 2005 Aug;3(4):915-27. doi: 10.1142/s0219720005001399.
9
Protein subcellular localization prediction based on compartment-specific features and structure conservation.
BMC Bioinformatics. 2007 Sep 8;8:330. doi: 10.1186/1471-2105-8-330.
10
Implicit motif distribution based hybrid computational kernel for sequence classification.
Bioinformatics. 2005 Apr 15;21(8):1429-36. doi: 10.1093/bioinformatics/bti212. Epub 2004 Dec 14.

Cited by

1
Understanding molecular mechanisms of disease through spatial proteomics.
Curr Opin Chem Biol. 2019 Feb;48:19-25. doi: 10.1016/j.cbpa.2018.09.016. Epub 2018 Oct 9.