The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications.

Authors

Katayama Toshiaki, Wilkinson Mark D, Vos Rutger, Kawashima Takeshi, Kawashima Shuichi, Nakao Mitsuteru, Yamamoto Yasunori, Chun Hong-Woo, Yamaguchi Atsuko, Kawano Shin, Aerts Jan, Aoki-Kinoshita Kiyoko F, Arakawa Kazuharu, Aranda Bruno, Bonnal Raoul Jp, Fernández José M, Fujisawa Takatomo, Gordon Paul Mk, Goto Naohisa, Haider Syed, Harris Todd, Hatakeyama Takashi, Ho Isaac, Itoh Masumi, Kasprzyk Arek, Kido Nobuhiro, Kim Young-Joo, Kinjo Akira R, Konishi Fumikazu, Kovarskaya Yulia, von Kuster Greg, Labarga Alberto, Limviphuvadh Vachiranee, McCarthy Luke, Nakamura Yasukazu, Nam Yunsun, Nishida Kozo, Nishimura Kunihiro, Nishizawa Tatsuya, Ogishima Soichi, Oinn Tom, Okamoto Shinobu, Okuda Shujiro, Ono Keiichiro, Oshita Kazuki, Park Keun-Joon, Putnam Nicholas, Senger Martin, Severin Jessica, Shigemoto Yasumasa, Sugawara Hideaki, Taylor James, Trelles Oswaldo, Yamasaki Chisato, Yamashita Riu, Satoh Noriyuki, Takagi Toshihisa

Affiliation

Database Center for Life Science, Research Organization of Information and Systems, 2-11-16 Yayoi, Bunkyo-ku, Tokyo, 113-0032, Japan.

Publication

J Biomed Semantics. 2011 Aug 2;2:4. doi: 10.1186/2041-1480-2-4.

Abstract

BACKGROUND

The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009.

RESULTS

Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real-world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; and iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs.

CONCLUSIONS

Beyond deriving prototype solutions for each use case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) the lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although these problems still made it difficult to solve the real-world problems posed to the developers by the biological researchers in attendance, we note the promise of addressing these issues within a semantic framework.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc22/3170566/3dda873f2d91/2041-1480-2-4-1.jpg
