TM-Aligner: Multiple sequence alignment tool for transmembrane proteins with reduced time and improved accuracy.

Affiliations

Department of Life Science, Shiv Nadar University, Greater Noida, UP, 201314, India.

Department of Animal Biotechnology, Sher-e-Kashmir University of Agricultural Sciences and Technology, Shuhama, Jammu and Kashmir, 190016, India.

Publication information

Sci Rep. 2017 Oct 2;7(1):12543. doi: 10.1038/s41598-017-13083-y.

Abstract

Membrane proteins play a significant role in living cells. Transmembrane proteins are estimated to constitute approximately 30% of proteins at the genomic scale. Developing alignment tools specific to transmembrane proteins has been difficult because of the limited number of experimentally validated protein structures. Alignment tools based on homology modeling give fairly good results, recapitulating 70-80% of the residues in reference alignments, provided that all input sequences have known template structures. However, homology modeling tools take a substantial amount of time, so aligning large numbers of sequences becomes computationally demanding. Here we present TM-Aligner, a new tool for transmembrane protein sequence alignment. TM-Aligner is based on the Wu-Manber and dynamic string matching algorithms, which significantly improve the accuracy and speed of multiple sequence alignment. We compared TM-Aligner with other popular tools and benchmarked it against three separate reference sets: BAliBASE 3.0 reference set 7 of alpha-helical transmembrane proteins, structure-based alignments of transmembrane proteins from the Pfam database, and structure alignments from GPCRDB. Benchmarking against these reference datasets indicated that TM-Aligner has the shortest turnaround time and shows significant improvements over the most accurate methods, such as PROMALS, MAFFT, TM-Coffee, Kalign, ClustalW, Muscle and PRALINE. TM-Aligner is freely available at http://lms.snu.edu.in/TM-Aligner/ .
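
The abstract describes accuracy as the share of a reference alignment that a tool recapitulates. One common way to quantify this in alignment benchmarking is a sum-of-pairs style score: the fraction of residue pairs aligned in the reference that are also aligned in the test alignment. The sketch below illustrates that idea only. It is not taken from TM-Aligner or from the benchmark's own scoring program, and the function names `aligned_pairs` and `sum_of_pairs_score` are hypothetical.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length gapped strings ('-' marks a gap).
    Each pair is ((seq_i, residue_index_i), (seq_j, residue_index_j)),
    where residue indices count ungapped positions within each sequence.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows must have the same length")
    counters = [0] * len(alignment)  # next ungapped index per sequence
    pairs = set()
    for col in range(len(alignment[0]) if alignment else 0):
        present = []
        for i, row in enumerate(alignment):
            if row[col] != "-":
                present.append((i, counters[i]))
                counters[i] += 1
        # Every two residues in the same column form one aligned pair.
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


# Toy example (made-up sequences, not benchmark data).
reference = ["ACGT", "A-GT"]
test = ["ACGT", "AG-T"]
score = sum_of_pairs_score(test, reference)  # 2 of 3 reference pairs recovered
```

A score of 1.0 means every residue pair in the reference is reproduced. Under this assumed definition, the 70-80% range quoted in the abstract would correspond to scores of roughly 0.7 to 0.8.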

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4717/5624947/c2430c67b182/41598_2017_13083_Fig1_HTML.jpg
