Sequence Alignment Using Machine Learning for Accurate Template-based Protein Structure Prediction.

Author information

Makigaki Shuichiro, Ishida Takashi

Affiliation

School of Computing, Tokyo Institute of Technology, Tokyo, Japan.

Publication information

Bio Protoc. 2020 May 5;10(9):e3600. doi: 10.21769/BioProtoc.3600.

Abstract

Template-based modeling, the process of predicting the tertiary structure of a protein from homologous protein structures, is useful when good templates are available. Indeed, modern homology detection methods can find remote homologs with high sensitivity. However, template-based models built from the alignments that homology detection produces are often less accurate than models built from ideal alignments. In this study, we propose a new method that generates pairwise sequence alignments for more accurate template-based modeling. Our method trains a machine learning model on structural alignments of known homologs. When calculating a sequence alignment, instead of using a fixed substitution matrix, the method dynamically predicts substitution scores with the trained model.
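
To make the last point concrete, the sketch below shows one way a dynamic-programming aligner can take its substitution scores from a callable rather than a fixed matrix. It is an illustration only, not the authors' implementation: the abstract does not specify the alignment variant, the gap model, or the model's input features, so the global (Needleman-Wunsch-style) recursion, the linear gap penalty, and the names `align`, `ScoreFn`, and `make_identity_scorer` are all assumptions. The identity scorer is a toy placeholder for wherever a trained model's prediction would be plugged in.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a scorer maps (query position i, template position j)
# to a substitution score. A trained model would sit behind this callable.
ScoreFn = Callable[[int, int], float]


def make_identity_scorer(query: str, template: str,
                         match: float = 2.0, mismatch: float = -1.0) -> ScoreFn:
    """Toy placeholder for a learned scorer: rewards identical residues only."""
    def score(i: int, j: int) -> float:
        return match if query[i] == template[j] else mismatch
    return score


def align(query: str, template: str, score: ScoreFn,
          gap: float = -1.0) -> Tuple[float, str, str]:
    """Global alignment with a linear gap penalty (both are assumptions)."""
    n, m = len(query), len(template)
    dp: List[List[float]] = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Traceback moves: 'D' = align residues, 'U' = gap in template,
    # 'L' = gap in query.
    move: List[List[str]] = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0], move[i][0] = i * gap, "U"
    for j in range(1, m + 1):
        dp[0][j], move[0][j] = j * gap, "L"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # The substitution score is requested per cell instead of being
            # looked up in a fixed matrix.
            candidates = [
                (dp[i - 1][j - 1] + score(i - 1, j - 1), "D"),
                (dp[i - 1][j] + gap, "U"),
                (dp[i][j - 1] + gap, "L"),
            ]
            # max() returns the first maximal item, so ties prefer 'D'.
            dp[i][j], move[i][j] = max(candidates, key=lambda c: c[0])

    # Follow the stored moves back from the bottom-right corner.
    aligned_q: List[str] = []
    aligned_t: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "D":
            aligned_q.append(query[i - 1])
            aligned_t.append(template[j - 1])
            i, j = i - 1, j - 1
        elif step == "U":
            aligned_q.append(query[i - 1])
            aligned_t.append("-")
            i -= 1
        else:
            aligned_q.append("-")
            aligned_t.append(template[j - 1])
            j -= 1
    return dp[n][m], "".join(reversed(aligned_q)), "".join(reversed(aligned_t))


if __name__ == "__main__":
    q, t = "ACDE", "ADE"
    total, aq, at = align(q, t, make_identity_scorer(q, t))
    print(total)
    print(aq)
    print(at)
```

Because the aligner only sees a callable, the toy scorer could be swapped for a function that builds features around positions `i` and `j` and returns a model's predicted score, without changing the recursion itself.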
