Vaskin Y Y, Khomicheva I V, Ignatieva E V, Vityaev E E
Novosibirsk State University, Department of Information Technology, Novosibirsk, Russia.
In Silico Biol. 2011;11(3-4):97-108. doi: 10.3233/ISB-2012-0448.
The automatic extraction of the hierarchical structure of eukaryotic gene regulatory regions lies at the junction of biology, mathematics and information technology. Solving the problem requires both an understanding of the sophisticated mechanisms of eukaryotic gene regulation and the application of advanced data mining techniques. This paper discusses an integrated system that implements a powerful method for relation mining of biological data. The system makes it possible to take into account prior information about gene regulatory regions known to the biologist, to perform the analysis at each hierarchical level, and to search for a solution by proceeding from simple hypotheses to complex ones. Integrating the ExpertDiscovery system into the UGENE toolkit provides a convenient environment for conducting complex research and automating the biologist's work. As a demonstration, the system has been applied to the recognition of SF1, SREBP and HNF4 vertebrate binding sites and to the analysis of human gene regulatory regions that promote liver-specific transcription.
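The idea of searching from simple hypotheses to complex ones can be illustrated with a small sketch. It is not the ExpertDiscovery algorithm or the UGENE API. It is a toy example that assumes a hypothesis is a conjunction of short sequence motifs (standing in for signals supplied by the biologist as prior knowledge) and that a hypothesis is extended only when the extension improves a simple score. The function names, the score and the sequences are all hypothetical.

```python
def contains_all(seq, motifs):
    """True if every motif occurs somewhere in the sequence."""
    return all(m in seq for m in motifs)


def score(motifs, positives, negatives):
    """Fraction of positive sequences matched minus fraction of negatives matched."""
    pos = sum(contains_all(s, motifs) for s in positives) / len(positives)
    neg = sum(contains_all(s, motifs) for s in negatives) / len(negatives)
    return pos - neg


def grow_hypotheses(signals, positives, negatives, max_size=3):
    """Search from single-signal hypotheses toward larger conjunctions.

    A hypothesis is extended by one signal only if the extended
    conjunction scores strictly higher than the hypothesis it came from.
    Returns the best (motifs, score) pair found.
    """
    frontier = [((s,), score((s,), positives, negatives)) for s in signals]
    best = max(frontier, key=lambda h: h[1])
    for _ in range(max_size - 1):
        extended = {}
        for motifs, current in frontier:
            for s in signals:
                if s in motifs:
                    continue
                candidate = tuple(sorted(motifs + (s,)))
                candidate_score = score(candidate, positives, negatives)
                if candidate_score > current:
                    extended[candidate] = candidate_score
        if not extended:
            break  # no extension improves on any simpler hypothesis
        frontier = list(extended.items())
        top = max(frontier, key=lambda h: h[1])
        if top[1] > best[1]:
            best = top
    return best


# Toy data (hypothetical): positives carry both ACGT and TGCA,
# negatives carry at most one of them.
positives = ["AACGTTTGCA", "CCACGTAATGCA", "ACGTGGTGCATT"]
negatives = ["AACGTTTTTT", "GGGTGCAGGG", "TTTTTTTT"]
signals = ["ACGT", "TGCA", "GGG"]  # stand-ins for biologist-supplied signals

if __name__ == "__main__":
    print(grow_hypotheses(signals, positives, negatives))
```

On this toy data each single motif matches all positives but also one negative, whereas the two-motif conjunction matches only the positives, so the search prefers the more complex hypothesis.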