Division of Biology, California Institute of Technology, Pasadena, CA 91125, USA.
BMC Bioinformatics. 2011 Jan 24;12:32. doi: 10.1186/1471-2105-12-32.
Caenorhabditis elegans gene-based phenotype information dates back to the 1970s, beginning with Sydney Brenner and the characterization of behavioral and morphological mutant alleles via classical genetics in order to understand nervous system function. Since then, C. elegans has become an important genetic model system for the study of basic biological and biomedical principles, largely through the use of phenotype analysis. Because of the growth of C. elegans as a genetically tractable model organism and the development of large-scale analyses, there has been a significant increase in phenotype data that must be managed and made accessible to the research community. To do so, a standardized vocabulary is necessary to integrate phenotype data from diverse sources, permit integration with other data types, and render the data in a computable form.
We describe the Worm Phenotype Ontology (WPO), a hierarchically structured, controlled vocabulary of terms that can be used to standardize phenotype descriptions in C. elegans. The WPO currently comprises 1,880 phenotype terms, 74% of which have been used to annotate phenotypes associated with more than 18,000 C. elegans genes. The scope of the WPO is not limited to C. elegans biology; it is also designed to incorporate phenotypes observed in related nematode species. We have enriched the value of the WPO by integrating it with other ontologies, thereby increasing the accessibility of worm phenotypes to non-nematode biologists. We are actively developing the WPO to continue to fulfill the evolving needs of the scientific community and hope to engage researchers in this crucial endeavor.
We provide a phenotype ontology (WPO) that will facilitate data retrieval and cross-species comparisons within the nematode community. In the larger scientific community, the WPO will permit data integration and interoperability across the different Model Organism Databases (MODs) and other biological databases. This standardized phenotype ontology will therefore allow for more complex data queries and enhance bioinformatic analyses.