Besemann Christopher, Denton Anne, Carr Nathan J, Prüss Birgit M
Department of Computer Sciences, North Dakota State University, Fargo ND 58105, USA.
Source Code Biol Med. 2006 Nov 29;1:8. doi: 10.1186/1751-0473-1-8.
The large amount of genomics data accumulated over the past decade requires extensive data mining. However, data mining, including pattern mining, is global in nature, which poses difficulties for users who want to study specific questions in a more local setting. This creates a need for techniques that allow a localized analysis of globally determined patterns.
We developed a tool that determines and evaluates global patterns based on protein property and network information, while retaining the benefits of a perspective targeted at biologists with specific goals and interests. The tool integrates our own data mining techniques with current visualization and navigation techniques. Its functionality is discussed in the context of the transcriptional regulatory network of the enteric bacterium Escherichia coli. Two biological questions were asked: (i) Which functional categories of proteins (identified by hidden Markov models) are regulated by a regulator with a specific domain? (ii) Which regulators are involved in the regulation of proteins that contain a common hidden Markov model? Using these examples, we explain the gene-centered and pattern-centered analysis that the tool permits.
In summary, we have developed a tool that can be used for a wide variety of applications in biology, medicine, or agriculture. The pattern mining engine is global in that patterns are determined across the entire network, yet the tool still permits a localized analysis for users who want to examine a subset of the network. We have named the tool BISON (Bio-Interface for the Semi-global analysis Of Network patterns).
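The two biological questions posed above can be illustrated with a minimal Python sketch. It is not BISON's implementation: it performs plain lookups over a toy regulatory network rather than pattern mining, and every identifier in it (`regA`, `gene1`, `domain_X`, `hmm_P`, and both function names) is a hypothetical placeholder, not a real E. coli annotation.

```python
from collections import defaultdict

# Hypothetical toy network: regulator -> genes it regulates.
# All names are placeholders, not real E. coli annotations.
regulates = {
    "regA": ["gene1", "gene2"],
    "regB": ["gene3"],
}

# Hypothetical protein properties: domains of regulators,
# and hidden Markov model (HMM) hits of regulated proteins.
regulator_domains = {"regA": {"domain_X"}, "regB": {"domain_Y"}}
protein_hmms = {
    "gene1": {"hmm_P"},
    "gene2": {"hmm_P", "hmm_Q"},
    "gene3": {"hmm_Q"},
}


def categories_regulated_by_domain(domain):
    """Question (i): count HMM categories among the targets of
    regulators that carry the given domain."""
    counts = defaultdict(int)
    for regulator, targets in regulates.items():
        if domain not in regulator_domains.get(regulator, set()):
            continue
        for target in targets:
            for hmm in protein_hmms.get(target, set()):
                counts[hmm] += 1
    return dict(counts)


def regulators_of_hmm(hmm):
    """Question (ii): list regulators with at least one target
    whose protein contains the given HMM."""
    return sorted(
        regulator
        for regulator, targets in regulates.items()
        if any(hmm in protein_hmms.get(t, set()) for t in targets)
    )


if __name__ == "__main__":
    print(categories_regulated_by_domain("domain_X"))
    print(regulators_of_hmm("hmm_Q"))
```

In this sketch the first function starts from a regulator property and looks outward across the network, while the second starts from a property of the regulated proteins and looks back at their regulators. A tool such as BISON would additionally assess whether such associations are significant across the entire network, which this example does not attempt.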