Groth Philip, Leser Ulf, Weiss Bertram
Research Laboratories, Bayer Schering Pharma AG, 13442, Berlin, Germany.
Methods Mol Biol. 2011;760:159-73. doi: 10.1007/978-1-61779-176-5_10.
In gene prediction, studying phenotypes is highly valuable for reducing the number of candidate loci in association studies and for aiding the prioritization of disease gene candidates. This is because phenotypes visibly reflect genetic activity, which makes them potentially one of the most useful data types for functional studies. However, systematic use of these data has begun only recently. 'Comparative phenomics' is the analysis of genotype-phenotype associations across species and experimental methods. It is an emerging research field of utmost importance for gene discovery and gene function annotation. In this chapter, we review the use of phenotype data in the biomedical field. We give an overview of phenotype resources, focusing on PhenomicDB, a cross-species genotype-phenotype database that is the largest available collection of phenotype descriptions across species and experimental methods. We report on its latest extension, in which genotype-phenotype relationships can be viewed as graphical representations of similar phenotypes clustered together ('phenoclusters'), supplemented with information from protein-protein interactions and Gene Ontology terms. We show that such 'phenoclusters' represent a novel approach to grouping genes functionally and to predicting novel gene functions with high precision. We explain how these data and methods can be used to supplement the results of gene discovery approaches. The aim of this chapter is to assist researchers interested in understanding how phenotype data can be used effectively in the gene discovery field.