Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, United States.
Department of Pathology, University of Pittsburgh, Pittsburgh, PA, United States.
J Biomed Inform. 2011 Dec;44 Suppl 1(0 1):S17-S23. doi: 10.1016/j.jbi.2011.04.009. Epub 2011 May 6.
We present a novel framework for integrative biomarker discovery from related but separate data sets created in biomarker profiling studies. The framework takes prior knowledge in the form of interpretable, modular rules, and uses them during the learning of rules on a new data set. The framework consists of two methods of transfer of knowledge from source to target data: transfer of whole rules and transfer of rule structures. We evaluated the methods on three pairs of data sets: one genomic and two proteomic. We used standard measures of classification performance and three novel measures of amount of transfer. Preliminary evaluation shows that whole-rule transfer improves classification performance over using the target data alone, especially when there is more source data than target data. It also improves performance over using the union of the data sets.
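The two transfer modes named in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not describe the rule learner, so the `Rule` class, the precision filter, the `min_precision` threshold, and the majority-vote refit below are all hypothetical choices made only to contrast keeping a source rule unchanged with keeping its markers and re-learning the rest on the target data.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple

# A sample maps a marker name to a discretized level, e.g. {"m1": "high"}.
Sample = Dict[str, str]
Labeled = Tuple[Sample, str]


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: IF every (marker, level) condition holds THEN label."""
    conditions: Tuple[Tuple[str, str], ...]
    label: str

    def matches(self, x: Sample) -> bool:
        return all(x.get(marker) == level for marker, level in self.conditions)


def precision(rule: Rule, data: List[Labeled]) -> float:
    """Fraction of samples covered by the rule whose label agrees with it."""
    covered = [y for x, y in data if rule.matches(x)]
    if not covered:
        return 0.0
    return sum(y == rule.label for y in covered) / len(covered)


def transfer_whole_rules(source_rules: List[Rule], target: List[Labeled],
                         min_precision: float = 0.75) -> List[Rule]:
    """Whole-rule transfer (assumed form): reuse a source rule unchanged
    if it is still precise enough on the target data."""
    return [r for r in source_rules if precision(r, target) >= min_precision]


def transfer_rule_structures(source_rules: List[Rule],
                             target: List[Labeled]) -> List[Rule]:
    """Structure transfer (assumed form): keep only the markers of each
    source rule, then re-learn levels and labels from the target data by
    majority vote over each observed combination of levels."""
    refit: List[Rule] = []
    for rule in source_rules:
        markers = tuple(marker for marker, _ in rule.conditions)
        votes: Dict[Tuple[str, ...], Counter] = {}
        for x, y in target:
            if all(m in x for m in markers):
                key = tuple(x[m] for m in markers)
                votes.setdefault(key, Counter())[y] += 1
        for key, counter in votes.items():
            label = counter.most_common(1)[0][0]
            refit.append(Rule(tuple(zip(markers, key)), label))
    return refit
```

In this sketch, whole-rule transfer carries over both the markers and their levels, so it relies on the source and target measurements being comparable, while structure transfer carries over only which markers appear together and lets the target data decide everything else.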