Integrating subcellular location for improving machine learning models of remote homology detection in eukaryotic organisms.

Authors

Shah Anuj R, Oehmen Christopher S, Harper Jill, Webb-Robertson Bobbie-Jo M

Affiliation

Computational Biology & Bioinformatics, Pacific Northwest National Laboratory, Richland, WA 99352, USA.

Publication

Comput Biol Chem. 2007 Apr;31(2):138-42. doi: 10.1016/j.compbiolchem.2007.02.012. Epub 2007 Feb 23.

DOI: 10.1016/j.compbiolchem.2007.02.012
PMID: 17416337

Abstract

A significant challenge in homology detection is to identify sequences that share a common evolutionary ancestor, despite significant primary sequence divergence. Remote homologs will often have less than 30% sequence identity, yet still retain common structural and functional properties. We demonstrate a novel method for identifying remote homologs using a support vector machine (SVM) classifier trained by fusing sequence similarity scores and subcellular location prediction. SVMs have been shown to perform well in a variety of applications where binary classification of data is the goal. At the same time, data fusion methods have been shown to be highly effective in enhancing discriminative power of data. Combining these two approaches in the application SVM-SimLoc resulted in identification of significantly more remote homologs (p-value<0.006) than using either sequence similarity or subcellular location independently.
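
The fusion idea in the abstract can be illustrated with a minimal sketch. It is not the SVM-SimLoc implementation: the feature values are invented toy numbers, the names `fuse`, `train_linear_svm`, `predict`, and `accuracy` are hypothetical, fusion is assumed to be plain concatenation of the two feature vectors, and a small linear SVM trained by batch sub-gradient descent stands in for whatever SVM package and kernel the authors used. Accuracy is measured on the training pairs only, to show that two weak features can separate the classes jointly when neither does alone.

```python
# Illustrative sketch only: toy numbers, not data or code from the paper.

def fuse(similarity, location):
    """Feature-level fusion (assumed here): concatenate the two feature vectors."""
    return list(similarity) + list(location)


def train_linear_svm(X, y, lam=0.01, eta=0.05, steps=5000):
    """Minimise lam/2 * ||w||^2 + mean hinge loss by batch sub-gradient descent.

    X is a list of feature vectors, y a list of labels in {+1, -1}.
    Returns (weights, bias).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:  # inside the margin: hinge loss is active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - eta * gj for wj, gj in zip(w, grad_w)]
        b -= eta * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (homolog) or -1 (non-homolog) for one feature vector."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1


def accuracy(X, y):
    """Train on (X, y) and report accuracy on the same training pairs."""
    w, b = train_linear_svm(X, y)
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)


# Hypothetical protein pairs: one similarity score and one
# location-agreement score per pair, with +1 = homologous, -1 = not.
sim = [[0.9], [0.3], [0.8], [0.7], [0.1], [0.4], [0.1], [0.2]]
loc = [[0.3], [0.9], [0.8], [0.6], [0.1], [0.1], [0.4], [0.2]]
labels = [1, 1, 1, 1, -1, -1, -1, -1]

fused = [fuse(s, l) for s, l in zip(sim, loc)]

acc_sim = accuracy(sim, labels)      # similarity alone
acc_loc = accuracy(loc, labels)      # location alone
acc_fused = accuracy(fused, labels)  # both features together
print(acc_sim, acc_loc, acc_fused)
```

In this toy set each single feature has a positive pair scoring below a negative pair, so no threshold on one feature can classify every pair, while the fused two-dimensional vectors are linearly separable. A real evaluation would use held-out protein families and a significance test, as the abstract's reported p-value implies.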

Similar articles

1
Integrating subcellular location for improving machine learning models of remote homology detection in eukaryotic organisms.
Comput Biol Chem. 2007 Apr;31(2):138-42. doi: 10.1016/j.compbiolchem.2007.02.012. Epub 2007 Feb 23.
2
SVM-HUSTLE--an iterative semi-supervised machine learning approach for pairwise protein remote homology detection.
Bioinformatics. 2008 Mar 15;24(6):783-90. doi: 10.1093/bioinformatics/btn028. Epub 2008 Feb 1.
3
Prediction of protein subcellular localization.
Proteins. 2006 Aug 15;64(3):643-51. doi: 10.1002/prot.21018.
4
Fast model-based protein homology detection without alignment.
Bioinformatics. 2007 Jul 15;23(14):1728-36. doi: 10.1093/bioinformatics/btm247. Epub 2007 May 8.
5
Improved prediction of subcellular location for apoptosis proteins by the dual-layer support vector machine.
Amino Acids. 2008 Aug;35(2):383-8. doi: 10.1007/s00726-007-0608-y. Epub 2007 Dec 21.
6
Profile-based direct kernels for remote homology detection and fold recognition.
Bioinformatics. 2005 Dec 1;21(23):4239-47. doi: 10.1093/bioinformatics/bti687. Epub 2005 Sep 27.
7
Support vector machines with profile-based kernels for remote protein homology detection.
Genome Inform. 2004;15(2):191-200.
8
Remote protein homology detection and fold recognition using two-layer support vector machine classifiers.
Comput Biol Med. 2011 Aug;41(8):687-99. doi: 10.1016/j.compbiomed.2011.06.004. Epub 2011 Jun 25.
9
Protein remote homology detection based on auto-cross covariance transformation.
Comput Biol Med. 2011 Aug;41(8):640-7. doi: 10.1016/j.compbiomed.2011.05.015. Epub 2011 Jun 12.
10
A structural alignment kernel for protein structures.
Bioinformatics. 2007 May 1;23(9):1090-8. doi: 10.1093/bioinformatics/btl642. Epub 2007 Jan 18.

Cited by

1
Computational prediction of type III and IV secreted effectors in gram-negative bacteria.
Infect Immun. 2011 Jan;79(1):23-32. doi: 10.1128/IAI.00537-10. Epub 2010 Oct 25.