Simmonds Peter
Centre for Immunity, Infection and Evolution, University of Edinburgh, Ashworth Laboratories, Kings Buildings, West Main Road, Edinburgh EH9 3JT, UK.
BMC Res Notes. 2012 Jan 20;5:50. doi: 10.1186/1756-0500-5-50.
There is an increasing need to develop bioinformatic tools to organise and analyse the rapidly growing amount of nucleotide and amino acid sequence data in organisms ranging from viruses to eukaryotes.
A simple sequence editor (SSE) was developed to create an integrated environment where sequences can be aligned, annotated, classified and directly analysed by a number of built-in bioinformatic programs. SSE incorporates a sequence editor for the creation of sequence alignments, a process assisted by integrated CLUSTAL/MUSCLE alignment programs and automated removal of indels. Sequences and sequence groups can be fully annotated and classified, with direct access to analytical programs that analyse diversity, recombination and RNA secondary structure.
Methods for analysing sequence diversity include measures of divergence and evolutionary distances, identity plots to detect regions of nucleotide or amino acid homology, reconstruction of sequence changes, mono-, di- and higher order nucleotide compositional biases and codon usage. Association Index calculations, GroupScans, Bootscanning and TreeOrder scans perform phylogenetic analyses that reconcile group membership with tree branching orders and provide powerful methods for examining segregation of alleles and detection of recombination events. Phylogeny changes across alignments and scoring of branching order differences between trees using the Robinson-Foulds algorithm allow effective visualisation of the sites of recombination events.
RNA secondary and tertiary structures play important roles in gene expression and RNA virus replication. For the latter, persistence of infection is additionally associated with pervasive RNA secondary structure throughout viral genomic RNA that modulates interactions with innate cell defences. SSE provides several programs to scan alignments for RNA secondary structure through folding energy thermodynamic calculations and phylogenetic methods (detection of co-variant changes, and structure conservation between divergent sequences). These analyses complement methods based on detection of sequence constraints, such as suppression of synonymous site variability.
For each program, results can be plotted in real time during analysis through an integrated graphics package, providing publication quality graphs. Results can also be directed to tabulated datafiles for import into spreadsheet or database programs for further analysis.
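To make the dinucleotide compositional bias measure mentioned among the diversity analyses above more concrete, the following is a minimal Python sketch of one common way to express such a bias: the ratio of observed to expected dinucleotide frequency, where the expected value is the product of the two mononucleotide frequencies. This is not SSE's own code, and the abstract does not state which formula SSE uses; the function name `dinucleotide_odds_ratios` and the handling of ambiguous bases are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def dinucleotide_odds_ratios(seq):
    """Return observed/expected frequency ratios for dinucleotides in seq.

    Hypothetical helper, not part of SSE. Expected frequency is the product
    of the mononucleotide frequencies of the two bases. Ambiguous characters
    (N, gaps, etc.) are ignored; U is treated as T.
    """
    seq = seq.upper().replace("U", "T")

    # Mononucleotide counts over unambiguous bases only.
    mono = Counter(b for b in seq if b in BASES)

    # Overlapping dinucleotide counts, skipping pairs with ambiguous bases.
    pairs = Counter(
        a + b for a, b in zip(seq, seq[1:]) if a in BASES and b in BASES
    )

    n_mono = sum(mono.values())
    n_pairs = sum(pairs.values())
    ratios = {}
    if n_pairs == 0:
        return ratios

    for a, b in product(BASES, repeat=2):
        expected = (mono[a] / n_mono) * (mono[b] / n_mono)
        # Dinucleotides containing a base absent from seq have no
        # defined ratio, so they are left out of the result.
        if expected > 0:
            ratios[a + b] = (pairs[a + b] / n_pairs) / expected
    return ratios
```

A ratio well below 1 indicates under-representation of that dinucleotide relative to base composition, and a ratio above 1 indicates over-representation. For real analyses the ratios would normally be computed over whole genomes or coding regions rather than short strings.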
SSE combines sequence editor functions with analytical tools in a comprehensive and user-friendly package that assists considerably in bioinformatic and evolutionary research.