Gordon Paul M K, Sensen Christoph W
University of Calgary, Faculty of Medicine, Sun Center of Excellence for Visual Genomics, Calgary, AB, Canada.
BMC Bioinformatics. 2007 Jun 18;8:208. doi: 10.1186/1471-2105-8-208.
Traditional HTML interfaces for input to and output from Bioinformatics analysis on the Web are highly variable in style, content and data formats. Combining multiple analyses can therefore be an onerous task for biologists. Semantic Web Services allow automated discovery of conceptual links between remote data analysis servers. A shared data ontology and service discovery/execution framework is particularly attractive in Bioinformatics, where data and services are often both disparate and distributed. Instead of biologists copying, pasting and reformatting data between various Web sites, Semantic Web Service protocols such as MOBY-S hold out the promise of seamlessly integrating multi-step analysis.
We have developed a program (Seahawk) that allows biologists to intuitively and seamlessly chain together Web Services using a data-centric approach, rather than the customary service-centric one. The approach is illustrated with a ferredoxin mutation analysis. Seahawk concentrates on lowering entry barriers for biologists: no prior knowledge of the data ontology or of the relevant services is required. In stark contrast to other MOBY-S clients, Seahawk users simply load Web pages and text files they already work with. Underlying the familiar Web-browser interaction is an XML data engine based on extensible XSLT style sheets, regular expressions, and XPath statements, which imports existing user data into the MOBY-S format.
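As a rough illustration of the rule-driven import idea described above, the following Python sketch scans free text with regular expressions and wraps each match in a small XML document that can then be queried with XPath-style expressions. It is not Seahawk's implementation: the rule names (`gi_number`, `dna_sequence`), the patterns, and the `objects`/`object` element names are all hypothetical and are not taken from the MOBY-S specification.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule set: (data type label, pattern with one capture group).
# These labels and patterns are illustrative assumptions, not Seahawk's rules.
RULES = [
    ("gi_number", re.compile(r"\bgi\|(\d+)\b")),
    ("dna_sequence", re.compile(r"\b([ACGT]{20,})\b")),
]


def extract_objects(text):
    """Wrap every rule match found in `text` in an <object> element."""
    root = ET.Element("objects")  # illustrative container, not MOBY-S XML
    for label, pattern in RULES:
        for match in pattern.finditer(text):
            item = ET.SubElement(root, "object", {"type": label})
            item.text = match.group(1)
    return root


sample = "Hit gi|12345 matched ACGTACGTACGTACGTACGTACGT in the query."
doc = extract_objects(sample)

# ElementTree supports a limited XPath subset, enough to select by attribute.
gi_ids = [e.text for e in doc.findall("./object[@type='gi_number']")]
sequences = [e.text for e in doc.findall("./object[@type='dna_sequence']")]
print(ET.tostring(doc, encoding="unicode"))
```

Keeping the rules in a data table rather than in code mirrors the "extensible" aspect mentioned above: under this assumption, supporting a new kind of user data means adding one rule rather than changing the extraction logic.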
As an easily accessible applet, Seahawk moves beyond standard Web browser interaction, providing mechanisms for the biologist to concentrate on the analytical task rather than on the technical details of data formats and Web forms. As the MOBY-S protocol nears a 1.0 specification, we expect more biologists to adopt these new semantics-oriented ways of doing Web-based analysis, which empower them to create more complicated, ad hoc analysis workflows without the assistance of a programmer.