Barlowe Scott, Coan Heather B, Youker Robert T
Department of Mathematics and Computer Science, Western Carolina University, Cullowhee, NC, United States of America.
Department of Biology, Western Carolina University, Cullowhee, NC, United States of America.
PeerJ. 2017 Jun 27;5:e3492. doi: 10.7717/peerj.3492. eCollection 2017.
Understanding how proteins mutate is critical to solving a host of biological problems. A mutation occurs when one amino acid is substituted for another in a protein sequence. The set of likelihoods for amino acid substitutions is stored in a matrix and supplied as input to alignment algorithms. The quality of the resulting alignment is used to assess the similarity of two or more sequences and can vary according to the assumptions modeled by the substitution matrix. Substitution strategies that differ only by minor parameter variations are often grouped together in families. For example, the BLOSUM and PAM matrix families are commonly used because they provide a standard, predefined way of modeling substitutions. However, researchers often do not know whether a given matrix family, or any individual matrix within a family, is the most suitable. Furthermore, predefined matrix families may inaccurately reflect a particular hypothesis that a researcher wishes to model, or may otherwise result in unsatisfactory alignments. In these cases, the ability to compare the effects of one or more custom matrices may be needed. This laborious process is often performed manually because the ability to load multiple matrices simultaneously and then compare their effects on alignments is not readily available in current software tools. This paper presents SubVis, an interactive R package for loading and applying multiple substitution matrices to pairwise alignments. Users can simultaneously explore alignments resulting from multiple predefined and custom substitution matrices. SubVis utilizes several of the alignment functions found in R, a common language among protein scientists. These functions are tied together with the Shiny platform, which allows users to modify input parameters. Information regarding alignment quality and individual amino acid substitutions is displayed using JavaScript, which provides interactive visualizations that reveal both high-level and low-level alignment information.
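The comparison described in the abstract, aligning the same pair of sequences under several substitution matrices and inspecting how the result changes, can be illustrated with a minimal sketch. The code below is not SubVis and does not use its R functions; it is a plain Python implementation of global (Needleman-Wunsch) alignment with a linear gap penalty. The names `identity_scorer`, `grouped_scorer`, `global_align`, and `compare_matrices` are hypothetical, and the two scoring schemes are toy stand-ins for real matrices such as BLOSUM or PAM.

```python
from typing import Callable, Dict, List, Tuple

# A "substitution matrix" is modeled here as a function (a, b) -> score.
Scorer = Callable[[str, str], int]


def identity_scorer(match: int = 1, mismatch: int = -1) -> Scorer:
    """Toy matrix: reward identical residues, penalize everything else."""
    def score(a: str, b: str) -> int:
        return match if a == b else mismatch
    return score


def grouped_scorer(groups: List[str], match: int = 2,
                   similar: int = 1, mismatch: int = -1) -> Scorer:
    """Toy custom matrix: substitutions within a residue group score 'similar'.

    The groups are an assumption supplied by the caller, for example a set
    of residues the researcher hypothesizes to be interchangeable.
    """
    group_of = {res: i for i, grp in enumerate(groups) for res in grp}

    def score(a: str, b: str) -> int:
        if a == b:
            return match
        ga, gb = group_of.get(a), group_of.get(b)
        if ga is not None and ga == gb:
            return similar
        return mismatch
    return score


def global_align(seq1: str, seq2: str, score: Scorer,
                 gap: int = -2) -> Tuple[int, str, str]:
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(seq1), len(seq2)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(seq1[i - 1], seq2[j - 1]),
                dp[i - 1][j] + gap,   # gap in seq2
                dp[i][j - 1] + gap,   # gap in seq1
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out1, out2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                dp[i][j] == dp[i - 1][j - 1] + score(seq1[i - 1], seq2[j - 1])):
            out1.append(seq1[i - 1])
            out2.append(seq2[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            out1.append(seq1[i - 1])
            out2.append("-")
            i -= 1
        else:
            out1.append("-")
            out2.append(seq2[j - 1])
            j -= 1
    return dp[n][m], "".join(reversed(out1)), "".join(reversed(out2))


def compare_matrices(seq1: str, seq2: str, scorers: Dict[str, Scorer],
                     gap: int = -2) -> Dict[str, Tuple[int, str, str]]:
    """Align the same pair under every scoring scheme and collect the results."""
    return {name: global_align(seq1, seq2, s, gap)
            for name, s in scorers.items()}


if __name__ == "__main__":
    schemes = {
        "identity": identity_scorer(),
        "hydrophobic-group": grouped_scorer(["ILVM"]),
    }
    for name, (total, a, b) in compare_matrices("AILV", "AIIV", schemes).items():
        print(name, total, a, b)
```

Scores from different matrices are on different scales, so the raw totals are not directly comparable. A tool of the kind described here would more plausibly compare the aligned positions and substitutions themselves.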