Shyu Chi-Ren, Harnsomburana Jaturon, Green Jason, Barb Adrian S, Kazic Toni, Schaeffer Mary, Coe Ed
Computer Science Department, University of Missouri, Columbia, MO 65211, USA.
J Bioinform Comput Biol. 2007 Dec;5(6):1193-213. doi: 10.1142/s0219720007003181.
Thousands of maize mutants exist, and they are invaluable resources for plant research. Geneticists use them to study the underlying mechanisms of biochemistry, cell biology, cell development, and cell physiology. To streamline the understanding of such complex processes, researchers need the most current versions of genetic and physical maps, tools that can recognize novel phenotypes or classify known ones, and an intimate knowledge of the biochemical processes that generate physiological and phenotypic effects. They must also know how all of these factors vary among species, alleles, germplasms, and environmental conditions. While robust databases such as MaizeGDB exist for some of these types of raw data, other crucial components are missing. Moreover, the management of visually observed mutant phenotypes is still in its infancy, as are the complex query methods that could draw on high-level, aggregated information to answer geneticists' questions. In this paper, we address this scientific challenge and propose a robust framework for managing knowledge of visually observed phenotypes, mining the correlation of visual characteristics with genetic maps, and discovering knowledge about the cross-species conservation of visual and genetic patterns. The ultimate goal of this research is to allow a geneticist to submit phenotypic and genomic information on a mutant to a knowledge base and ask, "What genes or environmental factors cause this visually observed phenotype?"