Hyduke Daniel R, Lewis Nathan E, Palsson Bernhard Ø
Department of Bioengineering, University of California - San Diego, La Jolla, CA 92093-0412, USA.
Mol Biosyst. 2013 Feb 2;9(2):167-74. doi: 10.1039/c2mb25453k. Epub 2012 Dec 18.
Over the past decade, a massive amount of research has been dedicated to generating omics data to gain insight into a variety of biological phenomena, including cancer, obesity, biofuel production, and infection. Although most of these omics data are publicly available, there is growing concern that many of them sit in databases without being used or fully analyzed. Statistical inference methods have been widely applied to gain insight into which genes may influence the activities of others in a given omics data set; however, they do not provide information on the underlying mechanisms or on whether the interactions are direct or distal. Biochemically, genetically, and genomically consistent knowledge bases are increasingly being used to extract deeper biological knowledge and understanding from these data sets than is possible with inferential methods. This improvement is largely due to knowledge bases providing a validated biological context for interpreting the data.