Multiple sequence alignment with hierarchical clustering.

Author information

Corpet F

Affiliation

Laboratoire de Génétique Cellulaire, INRA Toulouse, France.

Publication information

Nucleic Acids Res. 1988 Nov 25;16(22):10881-90. doi: 10.1093/nar/16.22.10881.

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC338945/

Abstract

An algorithm is presented for the multiple alignment of sequences, either proteins or nucleic acids, that is both accurate and easy to use on microcomputers. The approach is based on the conventional dynamic-programming method of pairwise alignment. Initially, a hierarchical clustering of the sequences is performed using the matrix of the pairwise alignment scores. The closest sequences are aligned creating groups of aligned sequences. Then close groups are aligned until all sequences are aligned in one group. The pairwise alignments included in the multiple alignment form a new matrix that is used to produce a hierarchical clustering. If it is different from the first one, iteration of the process can be performed. The method is illustrated by an example: a global alignment of 39 sequences of cytochrome c.
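
The first stage described in the abstract (pairwise dynamic-programming scores, then a hierarchical clustering built from the score matrix) can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the match/mismatch/gap values, the average-linkage merge rule, and the function names `nw_score` and `guide_tree` are all hypothetical choices, since the abstract does not specify the scoring scheme or the clustering criterion. The later stage, aligning groups of already aligned sequences, is omitted.

```python
from itertools import combinations

def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score of a and b (dynamic programming, linear gap cost).

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

def guide_tree(seqs):
    """Cluster sequences by pairwise score; return the merge order as nested tuples.

    Assumption: clusters are merged by highest average pairwise score
    (average linkage); the abstract does not name the criterion used.
    """
    score = {frozenset((i, j)): nw_score(seqs[i], seqs[j])
             for i, j in combinations(range(len(seqs)), 2)}
    # Each cluster is (tree, list of member sequence indices).
    clusters = [(i, [i]) for i in range(len(seqs))]

    def avg(c1, c2):
        total = sum(score[frozenset((i, j))] for i in c1[1] for j in c2[1])
        return total / (len(c1[1]) * len(c2[1]))

    while len(clusters) > 1:
        # Pick the two clusters whose members score highest on average.
        x, y = max(combinations(range(len(clusters)), 2),
                   key=lambda p: avg(clusters[p[0]], clusters[p[1]]))
        merged = ((clusters[x][0], clusters[y][0]),
                  clusters[x][1] + clusters[y][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters[0][0]

# The two identical sequences are grouped first, then joined with the outlier.
print(guide_tree(["GATTACA", "GATTACA", "CCCCCCC"]))  # (2, (0, 1))
```

The nested tuples give the order in which groups would be aligned: innermost pairs first, then progressively larger groups until all sequences sit in one group.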

Similar articles

1

Multiple sequence alignment with hierarchical clustering.

Nucleic Acids Res. 1988 Nov 25;16(22):10881-90. doi: 10.1093/nar/16.22.10881.

2

A strategy for the rapid multiple alignment of protein sequences. Confidence levels from tertiary structure comparisons.

J Mol Biol. 1987 Nov 20;198(2):327-37. doi: 10.1016/0022-2836(87)90316-0.

3

Hierarchical method to align large numbers of biological sequences.

Methods Enzymol. 1990;183:456-74. doi: 10.1016/0076-6879(90)83031-4.

4

A novel approach to local reliability of sequence alignments.

Bioinformatics. 2002 Jun;18(6):847-54. doi: 10.1093/bioinformatics/18.6.847.

5

Alignment of protein sequences by their profiles.

Protein Sci. 2004 Apr;13(4):1071-87. doi: 10.1110/ps.03379804.

6

CLUSTAL: a package for performing multiple sequence alignment on a microcomputer.

Gene. 1988 Dec 15;73(1):237-44. doi: 10.1016/0378-1119(88)90330-7.

7

MISHIMA--a new method for high speed multiple alignment of nucleotide sequences of bacterial genome scale data.

BMC Bioinformatics. 2010 Mar 18;11:142. doi: 10.1186/1471-2105-11-142.

8

A symmetric-iterated multiple alignment of protein sequences.

J Mol Biol. 1998 Feb 13;276(1):249-64. doi: 10.1006/jmbi.1997.1527.

9

Search of latent periodicity in amino acid sequences by means of genetic algorithm and dynamic programming.

Stat Appl Genet Mol Biol. 2016 Oct 1;15(5):381-400. doi: 10.1515/sagmb-2015-0079.

10

Multiple DNA and protein sequence alignment based on segment-to-segment comparison.

Proc Natl Acad Sci U S A. 1996 Oct 29;93(22):12098-103. doi: 10.1073/pnas.93.22.12098.

Cited by

1

Characterization of the extrinsic and intrinsic signatures and therapeutic vulnerability of small cell lung cancers.

Signal Transduct Target Ther. 2025 Sep 10;10(1):290. doi: 10.1038/s41392-025-02378-6.

2

PiPho85, a cyclin-dependent kinase of Piriformospora indica rescue colonized maize plants grown under salt stress.

World J Microbiol Biotechnol. 2025 Sep 9;41(9):322. doi: 10.1007/s11274-025-04513-5.

3

Identification of Ticks on Migratory Birds in Poland During the 2023 and 2024 Spring Seasons.

Life (Basel). 2025 Aug 19;15(8):1311. doi: 10.3390/life15081311.

4

Pioneering Comparative Proteomic and Enzymatic Profiling of Amazonian Scorpion Venoms Enables the Isolation of Their First α-Ktx, Metalloprotease, and Phospholipase A.

Toxins (Basel). 2025 Aug 15;17(8):411. doi: 10.3390/toxins17080411.

5

TRsv: simultaneous detection of tandem repeat variations, structural variations, and short indels using long read sequencing data.

Genome Biol. 2025 Aug 20;26(1):246. doi: 10.1186/s13059-025-03718-z.

6

Characterization of strictly lytic phages infecting from Merlot wines and proposal of a new genus.

Microbiol Spectr. 2025 Sep 2;13(9):e0258824. doi: 10.1128/spectrum.02588-24. Epub 2025 Aug 12.

7

Development of an Aptamer-Based qPCR Method for the Selective and Rapid Picomolar-Level Detection of Perfluorooctanesulfonic Acid in Water.

Environ Sci Technol. 2025 Aug 19;59(32):17247-17257. doi: 10.1021/acs.est.5c04730. Epub 2025 Aug 7.

8

The cytochrome oxidase defect in ISC-depleted yeast is caused by impaired iron-sulfur cluster maturation of the mitoribosome assembly factor Rsm22.

FEBS Lett. 2025 Aug;599(16):2301-2317. doi: 10.1002/1873-3468.70129. Epub 2025 Aug 6.

9

A novel acidic laminarinase derived from Jermuk hot spring metagenome.

Appl Microbiol Biotechnol. 2025 Jul 26;109(1):172. doi: 10.1007/s00253-025-13557-4.

10

Comparative single-cell analyses reveal evolutionary repurposing of a conserved gene programme in bat wing development.

Nat Ecol Evol. 2025 Jul 16. doi: 10.1038/s41559-025-02780-x.
