Strong Michael, Graeber Thomas G, Beeby Morgan, Pellegrini Matteo, Thompson Michael J, Yeates Todd O, Eisenberg David
Howard Hughes Medical Institute, University of California at Los Angeles, Box 951570, Los Angeles, CA 90095-1570, USA.
Nucleic Acids Res. 2003 Dec 15;31(24):7099-109. doi: 10.1093/nar/gkg924.
Genome-wide functional linkages among proteins in cellular complexes and metabolic pathways can be inferred from high-throughput experiments, such as DNA microarrays, or from bioinformatic analyses. Here we describe a method for visualizing and interpreting genome-wide functional linkages inferred by the Rosetta Stone, Phylogenetic Profile, Operon and Conserved Gene Neighbor computational methods. The method constructs a genome-wide functional linkage map, in which each significant functional linkage between a pair of proteins is displayed on a two-dimensional scatter plot organized according to the order of genes along the chromosome. Subsequent hierarchical clustering of the map reveals clusters of genes with similar functional linkage profiles and facilitates the inference of protein function and the discovery of functionally linked gene clusters throughout the genome. We illustrate this method by applying it to the genome of the pathogenic bacterium Mycobacterium tuberculosis, assigning cellular functions to previously uncharacterized proteins involved in cell wall biosynthesis, signal transduction, chaperone activity, energy metabolism and polysaccharide biosynthesis.
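The two steps named in the abstract, building a linkage map indexed by chromosomal gene order and then clustering genes by their linkage profiles, can be illustrated with a small sketch. This is not the authors' implementation. The gene names and linked pairs are hypothetical, and the choice of Jaccard distance, average linkage, and a distance cutoff of 0.5 are assumptions made only for illustration; the abstract does not specify them.

```python
from itertools import combinations


def build_linkage_map(gene_order, linked_pairs):
    """Return a symmetric 0/1 matrix; rows and columns follow chromosomal order."""
    index = {gene: i for i, gene in enumerate(gene_order)}
    n = len(gene_order)
    matrix = [[0] * n for _ in range(n)]
    for a, b in linked_pairs:
        i, j = index[a], index[b]
        matrix[i][j] = matrix[j][i] = 1  # one scatter-plot point at (i, j) and (j, i)
    return matrix


def jaccard_distance(p, q):
    """Distance between two linkage profiles (assumed metric, not from the paper)."""
    both = sum(1 for x, y in zip(p, q) if x and y)
    either = sum(1 for x, y in zip(p, q) if x or y)
    return 1.0 - both / either if either else 1.0


def cluster_profiles(gene_order, matrix, threshold=0.5):
    """Average-linkage agglomerative clustering of the rows of the linkage map.

    Merging stops once the closest pair of clusters is farther apart than
    `threshold` (a hypothetical cutoff chosen for this sketch).
    """
    clusters = [[i] for i in range(len(gene_order))]

    def dist(c1, c2):
        total = sum(jaccard_distance(matrix[i], matrix[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: dist(clusters[ab[0]], clusters[ab[1]]),
        )
        if dist(clusters[a], clusters[b]) > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(gene_order[i] for i in c) for c in clusters]


# Hypothetical genes listed in chromosomal order, with hypothetical linkages.
genes = ["geneA", "geneB", "geneC", "geneD", "geneE"]
pairs = [("geneA", "geneC"), ("geneA", "geneD"),
         ("geneB", "geneC"), ("geneB", "geneD")]

linkage_map = build_linkage_map(genes, pairs)
print(cluster_profiles(genes, linkage_map))
```

In this toy input, geneA and geneB share the same linkage partners, so they fall into one cluster even though they are not linked to each other; that is the sense in which clustering groups genes by similar linkage profiles. A real analysis would more likely use an established clustering library and a plotting package for the scatter plot.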