ProtDet-CCH: Protein Remote Homology Detection by Combining Long Short-Term Memory and Ranking Methods.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2019 Jul-Aug;16(4):1203-1210. doi: 10.1109/TCBB.2018.2789880. Epub 2018 Jan 5.

DOI:10.1109/TCBB.2018.2789880
PMID:29993950
Abstract

Protein remote homology detection is one of the most challenging tasks in sequence analysis and has been studied extensively. Methods based on discriminative models and methods based on ranking have both achieved state-of-the-art performance, and the two kinds of methods are complementary. In this study, three LSTM models (ULSTM, BLSTM, and CNN-BLSTM) were used to construct predictors for protein remote homology detection. These models automatically extract local and global sequence-order information. Combined with PSSMs, CNN-BLSTM achieved the best performance among the three LSTM-based models; we named this method CNN-BLSTM-PSSM. Finally, a new method called ProtDet-CCH was proposed by combining CNN-BLSTM-PSSM with the ranking method HHblits. Tested on a widely used SCOP benchmark dataset, ProtDet-CCH achieved an ROC score of 0.998 and an ROC50 score of 0.982, significantly outperforming other existing state-of-the-art methods. Experimental results on two updated SCOPe independent datasets showed that ProtDet-CCH achieves stable performance. Furthermore, the method can provide useful insights for studying the features and motifs of protein families and superfamilies. It is anticipated that ProtDet-CCH will become a very useful tool for protein remote homology detection.
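The ROC and ROC50 scores reported in the abstract can be illustrated with a short sketch. The abstract does not describe the evaluation protocol, so this assumes the commonly used ROC-n definition: the area under the ROC curve up to the first n false positives, normalized to the range 0 to 1. The function name `roc_n`, its parameters, and the handling of lists with fewer than n negatives are illustrative assumptions, not the authors' code.

```python
from typing import Sequence


def roc_n(scores: Sequence[float], labels: Sequence[int], n: int = 50) -> float:
    """Normalized area under the ROC curve up to the first n false positives.

    labels use 1 for a true homolog and 0 for a non-homolog (assumed encoding).
    If the list holds fewer than n negatives, all negatives are used, which
    makes the result equal to the full ROC score (an assumed convention).
    """
    total_pos = sum(1 for y in labels if y == 1)
    total_neg = len(labels) - total_pos
    if total_pos == 0 or total_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    limit = min(n, total_neg)
    # Rank hits from highest to lowest score; ties keep their input order.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    true_pos = 0
    false_pos = 0
    area = 0
    for _, label in ranked:
        if label == 1:
            true_pos += 1
        else:
            false_pos += 1
            # Each false positive adds the number of positives ranked above it.
            area += true_pos
            if false_pos == limit:
                break
    return area / (limit * total_pos)


if __name__ == "__main__":
    example_scores = [0.9, 0.8, 0.7, 0.6]
    example_labels = [1, 0, 1, 0]
    print(roc_n(example_scores, example_labels))       # full ROC on this toy list
    print(roc_n(example_scores, example_labels, n=1))  # ROC1: only the top false positive
```

Under this definition a score of 1.0 means every true homolog is ranked above the first n false positives, so ROC50 emphasizes the top of the ranked list, where a user would actually look.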

Similar articles

1. ProtDet-CCH: Protein Remote Homology Detection by Combining Long Short-Term Memory and Ranking Methods.
   IEEE/ACM Trans Comput Biol Bioinform. 2019 Jul-Aug;16(4):1203-1210. doi: 10.1109/TCBB.2018.2789880. Epub 2018 Jan 5.
2. Protein remote homology detection based on bidirectional long short-term memory.
   BMC Bioinformatics. 2017 Oct 10;18(1):443. doi: 10.1186/s12859-017-1842-2.
3. Application of learning to rank to protein remote homology detection.
   Bioinformatics. 2015 Nov 1;31(21):3492-8. doi: 10.1093/bioinformatics/btv413. Epub 2015 Jul 10.
4. Fast model-based protein homology detection without alignment.
   Bioinformatics. 2007 Jul 15;23(14):1728-36. doi: 10.1093/bioinformatics/btm247. Epub 2007 May 8.
5. dRHP-PseRA: detecting remote homology proteins using profile-based pseudo protein sequence and rank aggregation.
   Sci Rep. 2016 Sep 1;6:32333. doi: 10.1038/srep32333.
6. Reducing dimensionality in remote homology detection using predicted contact maps.
   Comput Biol Med. 2015 Apr;59:64-72. doi: 10.1016/j.compbiomed.2015.01.020. Epub 2015 Jan 31.
7. Protein remote homology detection based on auto-cross covariance transformation.
   Comput Biol Med. 2011 Aug;41(8):640-7. doi: 10.1016/j.compbiomed.2011.05.015. Epub 2011 Jun 12.
8. Protein Remote Homology Detection and Fold Recognition Based on Sequence-Order Frequency Matrix.
   IEEE/ACM Trans Comput Biol Bioinform. 2019 Jan-Feb;16(1):292-300. doi: 10.1109/TCBB.2017.2765331. Epub 2017 Oct 23.
9. Remote protein homology detection and fold recognition using two-layer support vector machine classifiers.
   Comput Biol Med. 2011 Aug;41(8):687-99. doi: 10.1016/j.compbiomed.2011.06.004. Epub 2011 Jun 25.
10. Combining evolutionary information extracted from frequency profiles with sequence-based kernels for protein remote homology detection.
    Bioinformatics. 2014 Feb 15;30(4):472-9. doi: 10.1093/bioinformatics/btt709. Epub 2013 Dec 5.

Cited by

1. PreTP-2L: identification of therapeutic peptides and their types using two-layer ensemble learning framework.
   Bioinformatics. 2023 Apr 3;39(4). doi: 10.1093/bioinformatics/btad125.
2. AI-Powered Blockchain Technology for Public Health: A Contemporary Review, Open Challenges, and Future Research Directions.
   Healthcare (Basel). 2022 Dec 27;11(1):81. doi: 10.3390/healthcare11010081.
3. Computational analysis and prediction of PE_PGRS proteins using machine learning.
   Comput Struct Biotechnol J. 2022 Jan 22;20:662-674. doi: 10.1016/j.csbj.2022.01.019. eCollection 2022.
4. RFPR-IDP: reduce the false positive rates for intrinsically disordered protein and region prediction by incorporating both fully ordered proteins and disordered proteins.
   Brief Bioinform. 2021 Mar 22;22(2):2000-2011. doi: 10.1093/bib/bbaa018.
5. k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification.
   Front Genet. 2019 Feb 12;10:33. doi: 10.3389/fgene.2019.00033. eCollection 2019.
6. Gene2vec: gene subsequence embedding for prediction of mammalian N6-methyladenosine sites from mRNA.
   RNA. 2019 Feb;25(2):205-218. doi: 10.1261/rna.069112.118. Epub 2018 Nov 13.
7. Feature extraction method for proteins based on Markov tripeptide by compressive sensing.
   BMC Bioinformatics. 2018 Jun 18;19(1):229. doi: 10.1186/s12859-018-2235-x.