Parametric and non-parametric masking of randomness in sequence alignments can be improved and leads to better resolved trees.

Affiliation

Zoologisches Forschungsmuseum A. Koenig, Adenauerallee 160, 53113 Bonn, Germany.

Publication information

Front Zool. 2010 Mar 31;7:10. doi: 10.1186/1742-9994-7-10.

DOI:10.1186/1742-9994-7-10

PMID:20356385

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2867768/

Abstract

BACKGROUND

Alignment masking, the technique of excluding alignment blocks prior to tree reconstruction, has been successful in improving the signal-to-noise ratio in sequence alignments. However, the lack of formally well-defined methods for identifying randomness in sequence alignments has prevented its routine application. In this study, we compared the effects on tree reconstruction of the most commonly used profiling method (GBLOCKS), which uses a predefined set of rules in combination with alignment masking, with those of a new profiling approach (ALISCORE) based on Monte Carlo resampling within a sliding window, using different data sets and alignment methods. While the GBLOCKS approach excludes variable sections above a certain threshold, the choice of which is left arbitrary, the ALISCORE algorithm is free of a priori rating of parameter space and is therefore more objective.
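The idea of Monte Carlo resampling within a sliding window can be illustrated with a minimal sketch. This is not the ALISCORE implementation: the match/mismatch score, the window size, the number of resamples, the 95% cutoff, and the function names `window_score` and `random_like_windows` are all assumptions chosen for illustration, and only a single pair of aligned sequences is handled.

```python
import random


def window_score(a, b):
    """Toy similarity score (an assumption): +1 per match, -1 per mismatch."""
    return sum(1 if x == y else -1 for x, y in zip(a, b))


def random_like_windows(seq_a, seq_b, window=6, n_resamples=200,
                        quantile=0.95, seed=0):
    """Flag each window whose observed similarity is no better than random.

    For every window position, the observed score is compared with scores of
    random sequence pairs drawn from the pooled character composition of the
    two input sequences. A window is flagged (True) when its score does not
    exceed the chosen quantile of that null distribution.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    rng = random.Random(seed)
    pool = list(seq_a + seq_b)  # empirical composition used for resampling
    flags = []
    for start in range(len(seq_a) - window + 1):
        observed = window_score(seq_a[start:start + window],
                                seq_b[start:start + window])
        null = sorted(
            window_score(rng.choices(pool, k=window),
                         rng.choices(pool, k=window))
            for _ in range(n_resamples)
        )
        cutoff = null[int(quantile * (n_resamples - 1))]
        flags.append(observed <= cutoff)
    return flags


if __name__ == "__main__":
    # Hypothetical toy input: two short aligned nucleotide sequences.
    print(random_like_windows("ACGTACGTACGT", "ACGTACGAACGT"))
```

Windows flagged `True` would be candidates for masking. A rule-based profiler would instead apply fixed thresholds to each block, which is the contrast drawn in the paragraph above.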

RESULTS

ALISCORE was successfully extended to amino acids using a proportional model and empirical substitution matrices to score randomness in multiple sequence alignments. A complex bootstrap resampling procedure yields an even distribution of scores for randomly similar sequences, against which the randomness of the observed sequence similarity is assessed. In tests on real data, both masking methods, GBLOCKS and ALISCORE, helped to improve tree resolution. The sliding window approach was less sensitive to different alignments of identical data sets and performed equally well on all data sets. ALISCORE is also capable of dealing with different substitution patterns and heterogeneous base composition. ALISCORE and the most relaxed GBLOCKS gap parameter setting performed best on all data sets. Correspondingly, Neighbor-Net analyses showed the greatest decrease in conflict for these settings.

CONCLUSIONS

Alignment masking improves the signal-to-noise ratio in multiple sequence alignments prior to phylogenetic reconstruction. Given the robust performance of alignment profiling, alignment masking should routinely be used to improve tree reconstructions. Parametric methods of alignment profiling can easily be extended to more complex likelihood-based models of sequence evolution, which opens the possibility of further improvements.
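The masking step itself, excluding flagged alignment columns before tree reconstruction, can be sketched as follows. The dictionary-based alignment representation, the function name `mask_alignment`, and the example column indices are assumptions for illustration. In practice the flagged columns would come from a profiling tool such as GBLOCKS or ALISCORE.

```python
def mask_alignment(alignment, flagged_columns):
    """Return a copy of the alignment with the flagged columns removed.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    flagged_columns: iterable of 0-based column indices to exclude.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        raise ValueError("all aligned sequences must have the same length")
    flagged = set(flagged_columns)
    return {
        name: "".join(ch for i, ch in enumerate(seq) if i not in flagged)
        for name, seq in alignment.items()
    }


if __name__ == "__main__":
    # Hypothetical toy alignment; column 3 is assumed to be flagged as noisy.
    toy = {"taxon1": "ACG-TA", "taxon2": "ACGGTA"}
    print(mask_alignment(toy, [3]))
```

The masked alignment is then passed to the tree reconstruction method in place of the original alignment.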
