Ojasoo T, Raynaud J P, Doé J C
Groupe Cristallographie et Simulations Interactives des Macromolécules Biologiques, Université Pierre et Marie Curie (Paris VI), France.
J Steroid Biochem Mol Biol. 1994 Jan;48(1):31-46. doi: 10.1016/0960-0760(94)90248-8.
To illustrate the informative value of descriptive multivariate analysis in biochemical screening, we analyzed several data matrices on the binding of steroids to the estrogen, progestin, androgen, glucocorticoid and mineralocorticoid receptors in different organs and species. We first compared dendrograms of steroid hormone receptors, obtained by automatic hierarchical classification analysis of the binding data, with published phylogenetic trees of nuclear receptors based on amino-acid sequence analysis. The former classification describes the affiliations among the receptors as given by the binding specificity of a population of 187 steroids in a traditional cytosol binding assay (an indirect comparison of ligand binding sites); the latter describes the affiliations among the receptors as given by a comparison of selected primary sequences involved in ligand-dependent regulation of transactivation and dimerization. A similar hierarchical classification was performed on the binding data of 62 steroids to myometrium cytosol from different species, to show to what extent the progesterone-binding proteins in these species are affiliated. Hierarchical clustering methods classify each type of variable (receptor or steroid) independently. To correlate both types of variable (receptors and steroids) on single-display graphs, it is necessary to resort to correspondence factorial analysis (CFA). CFA ranks the information content within the experimental system, highlighting major correlations and disclosing secondary correlations by eliminating redundant information and background noise. Applied to published data, this multivariate method illustrated the particular specificity of estrogen binding in human vagina and raised the question of the nature of the binding protein in this tissue. Our examples are based on small data tables that can be, and have been, analyzed de visu (by visual inspection). Such descriptive multivariate techniques are nonetheless indispensable for the analysis of large data banks, not only to define structure-activity relationships but also to estimate the degrees of affiliation among the biological variables being measured. Knowledge of such affiliations will help to organize available information in a context where the complexity of the biological systems under study is becoming increasingly apparent.
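As a rough illustration of the hierarchical classification step, the sketch below clusters receptors by their steroid-binding profiles. It is a minimal example under stated assumptions, not the authors' procedure: the abstract does not name a distance metric or linkage rule, so Euclidean distance and average linkage (UPGMA) are assumed here, and the `BINDING` table holds made-up relative binding affinities rather than data from the paper.

```python
from itertools import combinations
from math import dist

# Hypothetical relative binding affinities (illustrative values only,
# NOT data from the paper). Rows: receptors. Columns: five steroids.
BINDING = {
    "ER": [100.0, 1.0, 0.5, 0.2, 0.1],
    "PR": [0.5, 100.0, 20.0, 30.0, 25.0],
    "AR": [1.0, 15.0, 100.0, 5.0, 3.0],
    "GR": [0.1, 25.0, 2.0, 100.0, 40.0],
    "MR": [0.1, 30.0, 1.0, 60.0, 100.0],
}


def average_linkage(profiles):
    """Agglomerative clustering with average linkage (assumed choice).

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of receptor names.
    """

    def link(a, b):
        # Mean Euclidean distance over all cross-cluster pairs.
        pairs = [(x, y) for x in a for y in b]
        return sum(dist(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        # Merge the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: link(*pair))
        merges.append((a, b, link(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


if __name__ == "__main__":
    for a, b, d in average_linkage(BINDING):
        print(f"{'+'.join(a)}  <->  {'+'.join(b)}  (distance {d:.1f})")
```

The ordered list of merges is the information a dendrogram draws. With these made-up values, GR and MR merge first and ER joins last; this reflects only the invented table and says nothing about the published dendrograms.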
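The idea behind correspondence factorial analysis (CFA), placing both receptors and steroids on the same axis, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' software: it extracts only the first factorial axis, uses plain power iteration in place of a full singular value decomposition, and runs on a hypothetical table of made-up binding values (`TABLE`, `STEROIDS` and `first_axis` are names introduced here for illustration).

```python
from math import sqrt

# Hypothetical binding table (illustrative values only, NOT data from
# the paper). Rows: receptors. Columns: the steroids in STEROIDS.
STEROIDS = ["estradiol", "progesterone", "DHT", "dexamethasone", "aldosterone"]
TABLE = {
    "ER": [100.0, 1.0, 0.5, 0.2, 0.1],
    "PR": [0.5, 100.0, 20.0, 30.0, 25.0],
    "AR": [1.0, 15.0, 100.0, 5.0, 3.0],
    "GR": [0.1, 25.0, 2.0, 100.0, 40.0],
    "MR": [0.1, 30.0, 1.0, 60.0, 100.0],
}


def first_axis(table, col_names, n_iter=500):
    """First axis of a correspondence analysis of a non-negative table."""
    row_names = list(table)
    total = sum(sum(table[name]) for name in row_names)
    p = [[x / total for x in table[name]] for name in row_names]
    r = [sum(row) for row in p]          # row masses
    c = [sum(col) for col in zip(*p)]    # column masses
    n_rows, n_cols = len(r), len(c)

    # Standardized residuals from the independence model r_i * c_j.
    s = [[(p[i][j] - r[i] * c[j]) / sqrt(r[i] * c[j]) for j in range(n_cols)]
         for i in range(n_rows)]
    inertia = sum(x * x for row in s for x in row)  # chi-square / total

    def times_s(v):
        return [sum(s[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]

    # Power iteration on S^T S for the leading right singular vector.
    v = [1.0 / (j + 1) for j in range(n_cols)]  # arbitrary starting vector
    for _ in range(n_iter):
        u = times_s(v)
        w = [sum(s[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    u = times_s(v)                      # equals sigma times the left vector
    sigma = sqrt(sum(x * x for x in u))
    return {
        "row_masses": dict(zip(row_names, r)),
        "row_coords": {row_names[i]: u[i] / sqrt(r[i]) for i in range(n_rows)},
        "col_coords": {col_names[j]: sigma * v[j] / sqrt(c[j]) for j in range(n_cols)},
        "eigenvalue": sigma ** 2,
        "inertia": inertia,
    }


if __name__ == "__main__":
    result = first_axis(TABLE, STEROIDS)
    share = result["eigenvalue"] / result["inertia"]
    print(f"axis 1 carries {share:.0%} of total inertia")
    for name, coord in {**result["row_coords"], **result["col_coords"]}.items():
        print(f"{name:>14s} {coord:+.3f}")
```

Because receptors and steroids receive coordinates on the same axis, a steroid plotted near a receptor contributes disproportionately to that receptor's binding profile. The share of inertia carried by each axis is what allows the information content to be ranked; later axes would be obtained the same way after removing the first.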