ProfileGrids as a new visual representation of large multiple sequence alignments: a case study of the RecA protein family.

Authors

Roca Alberto I, Almada Albert E, Abajian Aaron C

Affiliation

Department of Molecular Biology and Biochemistry, 560 Steinhaus Hall, University of California, Irvine, California 92697-3900, USA.

Publication

BMC Bioinformatics. 2008 Dec 22;9:554. doi: 10.1186/1471-2105-9-554.

Abstract

BACKGROUND

Multiple sequence alignments are a fundamental tool for the comparative analysis of proteins and nucleic acids. However, large data sets are no longer manageable for visualization and investigation using the traditional stacked sequence alignment representation.

RESULTS

We introduce ProfileGrids that represent a multiple sequence alignment as a matrix color-coded according to the residue frequency occurring at each column position. JProfileGrid is a Java application for computing and analyzing ProfileGrids. A dynamic interaction with the alignment information is achieved by changing the ProfileGrid color scheme, by extracting sequence subsets at selected residues of interest, and by relating alignment information to residue physical properties. Conserved family motifs can be identified by the overlay of similarity plot calculations on a ProfileGrid. Figures suitable for publication can be generated from the saved spreadsheet output of the colored matrices as well as by the export of conservation information for use in the PyMOL molecular visualization program. We demonstrate the utility of ProfileGrids on 300 bacterial homologs of the RecA family, a universally conserved protein involved in DNA recombination and repair. Careful attention was paid to curating the collected RecA sequences since ProfileGrids allow the easy identification of rare residues in an alignment. We relate the RecA alignment sequence conservation to the following three topics: the recently identified DNA binding residues, the unexplored MAW motif, and a unique Bacillus subtilis RecA homolog sequence feature.
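The matrix representation described above can be illustrated with a minimal Python sketch. This is not JProfileGrid's implementation (that tool is a Java application). The function names `profile_grid` and `subset_with_residue`, the residue alphabet, and the four toy sequences are assumptions made up for illustration only. The sketch computes, for each alignment column, the fraction of sequences carrying each residue, and shows how a subset of sequences sharing a residue at a chosen column might be extracted.

```python
from collections import Counter

# Assumed alphabet: the 20 standard amino acids plus the gap character.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY-"


def profile_grid(alignment):
    """Return {residue: [frequency in each column]} for aligned sequences.

    Characters outside AMINO_ACIDS (e.g. 'X') are ignored in this sketch.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    n_sequences = len(alignment)
    grid = {residue: [0.0] * length for residue in AMINO_ACIDS}
    for column in range(length):
        counts = Counter(seq[column] for seq in alignment)
        for residue, count in counts.items():
            if residue in grid:
                grid[residue][column] = count / n_sequences
    return grid


def subset_with_residue(alignment, column, residue):
    """Return the sequences that carry `residue` at the 0-based `column`."""
    return [seq for seq in alignment if seq[column] == residue]


# Hypothetical toy alignment, not real RecA data.
toy_alignment = ["MAWK", "MAWR", "MSWK", "M-WK"]
grid = profile_grid(toy_alignment)
rare = subset_with_residue(toy_alignment, 3, "R")
```

Each row of `grid` corresponds to one residue and each list index to one alignment column, so the values could be mapped to colors to mimic the color-coded matrix. A low but nonzero frequency marks a rare residue, and `subset_with_residue` shows which sequences contribute it.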

CONCLUSION

ProfileGrids allow large protein families to be visualized more effectively than the traditional stacked sequence alignment form. This new graphical representation facilitates the determination of the sequence conservation at residue positions of interest, enables the examination of structural patterns by using residue physical properties, and permits the display of rare sequence features within the context of an entire alignment. JProfileGrid is free for non-commercial use and is available from http://www.profilegrid.org. Furthermore, we present a curated RecA protein collection that is more diverse than previous data sets; and, therefore, this RecA ProfileGrid is a rich source of information for nanoanatomy analysis.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d947/2663765/f069d13deeef/1471-2105-9-554-1.jpg
