Department of Industrial & Information Systems Engineering, Ajou University, Suwon 443-749, Korea.
BMC Bioinformatics. 2011 Feb 15;12 Suppl 1(Suppl 1):S51. doi: 10.1186/1471-2105-12-S1-S51.
Although many biological databases are applying semantic web technologies, meaningful biological hypothesis testing cannot be easily achieved. Database-driven high throughput genomic hypothesis testing requires two capabilities: obtaining semantically relevant experimental data and performing appropriate statistical tests on the retrieved data. Tissue Microarray (TMA) data are semantically rich and contain many biologically important hypotheses waiting for high throughput conclusions.
An application-specific ontology was developed for managing TMA and DNA microarray databases with semantic web technologies. Data were represented in Resource Description Framework (RDF) according to the framework of the ontology. Applications for hypothesis testing on TMA data (Xperanto-RDF) were designed and implemented by (1) formulating the syntactic and semantic structures of the hypotheses derived from TMA experiments, (2) formulating SPARQL queries to reflect the semantic structures of the hypotheses, and (3) performing statistical tests on the result sets returned by the SPARQL queries.
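The three steps above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the SPARQL prefix and property names (`tma:stainingResult`, `tma:tumorGrade`) are hypothetical and are not the actual Xperanto-RDF ontology terms, the rows are made-up toy data standing in for a SPARQL result set, and Fisher's exact test is simply one plausible choice of test for a 2x2 association hypothesis; the abstract does not say which tests Xperanto-RDF applies.

```python
from math import comb

# Step 2 (assumed form): a SPARQL query mirroring the hypothesis
# "staining result is associated with tumor grade".
# The prefix, class, and property names are hypothetical.
QUERY = """
PREFIX tma: <http://example.org/tma#>
SELECT ?core ?stain ?grade WHERE {
  ?core a tma:Core ;
        tma:stainingResult ?stain ;
        tma:tumorGrade ?grade .
}
"""

# Made-up toy rows standing in for the bindings a SPARQL endpoint would return.
ROWS = (
    [{"stain": "positive", "grade": "high"}] * 4
    + [{"stain": "positive", "grade": "low"}] * 1
    + [{"stain": "negative", "grade": "high"}] * 1
    + [{"stain": "negative", "grade": "low"}] * 4
)


def contingency_table(rows, row_key, row_pos, col_key, col_pos):
    """Count rows into a 2x2 table (a, b, c, d)."""
    a = b = c = d = 0
    for r in rows:
        in_row = r[row_key] == row_pos
        in_col = r[col_key] == col_pos
        if in_row and in_col:
            a += 1
        elif in_row:
            b += 1
        elif in_col:
            c += 1
        else:
            d += 1
    return a, b, c, d


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for a 2x2 table."""
    n = a + b + c + d
    row1, col1 = a + b, a + c
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Step 3: run the test on the (toy) result set.
table = contingency_table(ROWS, "stain", "positive", "grade", "high")
p_value = fisher_exact_two_sided(*table)
print(table, round(p_value, 4))
```

In a real deployment the toy rows would be replaced by the bindings returned when the query is executed against the RDF store, and the choice of test would follow from the structure of the hypothesis (for example, a survival comparison would call for a different test than a 2x2 association).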
When a user designs a hypothesis in Xperanto-RDF and submits it, the hypothesis can be tested against TMA experimental data stored in Xperanto-RDF. When we evaluated four previously validated hypotheses as an illustration, all the hypotheses were supported by Xperanto-RDF.
We demonstrated the utility of high throughput biological hypothesis testing. We believe that such testing can benefit preliminary investigations carried out before highly controlled experiments are performed.