Jablonowski Karl
Division of Emergency Medicine, Department of Medicine, University of Washington, 325 9th Ave., Seattle, WA, USA.
Division of Biomedical and Health Informatics, Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA.
Methods Mol Biol. 2017;1555:99-113. doi: 10.1007/978-1-4939-6762-9_7.
Proteomic clustering analysis provides a means of identifying and visualizing relationships in an extremely complex field of study with many interacting parts. Recent high-throughput studies of Src Homology 2 (SH2) domains are amassing many and varied datasets. A strategy for analyzing patterns across these large datasets is required to transform the information into knowledge. Neighbor-joining phylogenetic trees, pairs scatter plots, and two-dimensional hierarchical clustering heatmaps are just a few of the diverse methods available to a proteomic researcher. This chapter examines selecting objects to be analyzed, selecting comparison functions to apply to those objects, and pseudo-code for processing data and preparing it for various types of analyses. Here I apply clustering analysis to previously collected SH2 domain datasets to gain insight into new binding or specificity patterns among the different SH2 domains.
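As a rough illustration of the workflow the abstract outlines (choose objects, choose a comparison function, then order the data for a two-dimensional clustered heatmap), the following minimal sketch may help. It is not the chapter's pseudo-code: the Pearson-correlation distance, the average-linkage merging rule, the function names, and the toy binding matrix with its placeholder labels are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(a, b):
    """Comparison function (assumed): 1 - Pearson correlation of two profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # a constant profile carries no correlation information
    return 1.0 - cov / sqrt(var_a * var_b)


def cluster_order(vectors, distance=pearson_distance):
    """Average-linkage agglomerative clustering; returns the leaf order."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(distance(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find and merge the two closest clusters.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0] if clusters else []


def cluster_2d(matrix, distance=pearson_distance):
    """Cluster rows and columns independently, then reorder the matrix."""
    row_order = cluster_order(matrix, distance)
    columns = [list(col) for col in zip(*matrix)]
    col_order = cluster_order(columns, distance)
    reordered = [[matrix[r][c] for c in col_order] for r in row_order]
    return row_order, col_order, reordered


# Hypothetical toy data: rows are objects (e.g. domains), columns are
# measurements (e.g. ligands). Values and labels are illustrative only.
row_labels = ["domain_A", "domain_B", "domain_C", "domain_D"]
matrix = [
    [0.9, 0.8, 0.1, 0.2],
    [0.1, 0.2, 0.9, 0.7],
    [0.8, 0.9, 0.2, 0.1],
    [0.2, 0.1, 0.8, 0.9],
]

row_order, col_order, reordered = cluster_2d(matrix)
print([row_labels[i] for i in row_order])
```

The reordered matrix places similar profiles next to each other, which is the arrangement a clustered heatmap draws. Swapping in a different comparison function (for example, Euclidean distance) only requires passing another callable as `distance`.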