Shoudai T, Lappe M, Miyano S, Shinohara A, Okazaki T, Arikawa S, Uchida T, Shimozono S, Shinohara T, Kuhara S
Department of Physics, Kyushu University, Fukuoka, Japan.
Proc Int Conf Intell Syst Mol Biol. 1995;3:359-66.
We have developed BONSAI, a machine discovery system that takes positive and negative examples as input and produces as its hypothesis a pair consisting of a decision tree over regular patterns and an alphabet indexing. In computer experiments, this system has succeeded in discovering reasonable knowledge about transmembrane domain sequences and signal peptide sequences. However, when several kinds of sequences are mixed in the data, a single BONSAI system cannot reasonably be expected to find a hypothesis that is both reasonably small and highly accurate. To handle such data, we have designed BONSAI Garden, a system in which several BONSAI instances and a program called Gardener run in parallel over a network to partition the data into a number of classes, together with hypotheses that explain these classes accurately.
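To make the shape of such a hypothesis concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that an alphabet indexing is a mapping from amino acid letters to a smaller symbol set and that each decision tree node tests whether the indexed sequence contains a given substring, which is a simplification of "regular patterns". The names `Leaf`, `Node`, `index_sequence`, and `classify`, as well as the toy indexing and the pattern `1111`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    label: bool  # True = positive class, False = negative class


@dataclass
class Node:
    pattern: str            # substring over the indexed (reduced) alphabet
    if_match: "Tree"        # subtree followed when the pattern occurs
    if_no_match: "Tree"     # subtree followed otherwise


Tree = Union[Leaf, Node]


def index_sequence(seq: str, indexing: Dict[str, str]) -> str:
    """Rewrite a sequence over the reduced alphabet given by the indexing."""
    return "".join(indexing[ch] for ch in seq)


def classify(seq: str, indexing: Dict[str, str], tree: Tree) -> bool:
    """Apply the hypothesis (indexing, tree) to one sequence."""
    indexed = index_sequence(seq, indexing)
    node = tree
    while isinstance(node, Node):
        node = node.if_match if node.pattern in indexed else node.if_no_match
    return node.label


# Toy indexing (an assumption, not a result from the paper):
# a handful of hydrophobic residues map to "1", everything else to "0".
HYDROPHOBIC = set("AVLIMFW")
indexing = {aa: ("1" if aa in HYDROPHOBIC else "0")
            for aa in "ACDEFGHIKLMNPQRSTVWY"}

# Toy one-node tree: a run of four "1" symbols means "positive".
tree = Node(pattern="1111", if_match=Leaf(True), if_no_match=Leaf(False))

print(classify("KDLLIVAGE", indexing, tree))
print(classify("KDEGSTNQ", indexing, tree))
```

In this sketch the indexing does the coarse-graining and the tree does the classification, so the two parts of the hypothesis can be varied independently.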