OpenFlyData: an exemplar data web integrating gene expression data on the fruit fly Drosophila melanogaster.

Affiliation

Department of Zoology, University of Oxford, Oxford OX1 3PS, UK.

Publication information

J Biomed Inform. 2010 Oct;43(5):752-61. doi: 10.1016/j.jbi.2010.04.004.

DOI: 10.1016/j.jbi.2010.04.004
PMID: 20382263

Abstract

MOTIVATION

Integrating heterogeneous data across distributed sources is a major requirement for in silico bioinformatics supporting translational research. For example, genome-scale data on patterns of gene expression in the fruit fly Drosophila melanogaster are widely used in functional genomic studies in many organisms to inform candidate gene selection and validate experimental results. However, current data integration solutions tend to be heavy weight, and require significant initial and ongoing investment of effort. Development of a common Web-based data integration infrastructure (a.k.a. data web), using Semantic Web standards, promises to alleviate these difficulties, but little is known about the feasibility, costs, risks or practical means of migrating to such an infrastructure.

RESULTS

We describe the development of OpenFlyData, a proof-of-concept system integrating gene expression data on D. melanogaster, combining Semantic Web standards with light-weight approaches to Web programming based on Web 2.0 design patterns. To support researchers designing and validating functional genomic studies, OpenFlyData includes user-facing search applications providing intuitive access to and comparison of gene expression data from FlyAtlas, the BDGP in situ database, and FlyTED, using data from FlyBase to expand and disambiguate gene names. OpenFlyData's services are also openly accessible, and are available for reuse by other bioinformaticians and application developers. Semi-automated methods and tools were developed to support labour- and knowledge-intensive tasks involved in deploying SPARQL services. These include methods for generating ontologies and relational-to-RDF mappings for relational databases, which we illustrate using the FlyBase Chado database schema; and methods for mapping gene identifiers between databases. The advantages of using Semantic Web standards for biomedical data integration are discussed, as are open issues. In particular, although the performance of open source SPARQL implementations is sufficient to query gene expression data directly from user-facing applications such as Web-based data fusions (a.k.a. mashups), we found open SPARQL endpoints to be vulnerable to denial-of-service-type problems, which must be mitigated to ensure reliability of services based on this standard. These results are relevant to data integration activities in translational bioinformatics.
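
As a rough illustration of what a relational-to-RDF mapping does, the sketch below turns one database row into N-Triples lines. It is not the tooling described in the abstract: the function names, the `example.org` namespaces, and the rule "one property per column" are all assumptions made for this example. The table and column names (`feature`, `uniquename`, `name`) follow the Chado schema, and the row values are placeholders.

```python
from typing import Dict, List

# Hypothetical namespaces, chosen for illustration only.
BASE = "http://example.org/flybase/"
VOCAB = "http://example.org/chado#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_to_triples(table: str, key: str, row: Dict[str, object]) -> List[str]:
    """Map one relational row to N-Triples lines.

    The key column identifies the subject URI, the table name becomes the
    rdf:type, and every other non-null column becomes a plain literal
    property named after the column. The key value is assumed to be
    URI-safe (true for identifiers of the FBgn form).
    """
    subject = f"<{BASE}{table}/{row[key]}>"
    triples = [f"{subject} <{RDF_TYPE}> <{VOCAB}{table}> ."]
    for column, value in row.items():
        if column == key or value is None:
            continue  # skip the key itself and SQL NULLs
        literal = escape_literal(str(value))
        triples.append(f'{subject} <{VOCAB}{column}> "{literal}" .')
    return triples


if __name__ == "__main__":
    # Placeholder row; the values are not taken from FlyBase.
    sample = {"uniquename": "FBgn0000001", "name": "example-gene", "residues": None}
    for line in row_to_triples("feature", "uniquename", sample):
        print(line)
```

A real mapping would also need to handle foreign keys (as links between resources rather than literals) and datatypes, which is part of why the abstract describes this work as labour- and knowledge-intensive.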

AVAILABILITY

The gene expression search applications and SPARQL endpoints developed for OpenFlyData are deployed at http://openflydata.org. FlyUI, a library of JavaScript widgets providing re-usable user-interface components for Drosophila gene expression data, is available at http://flyui.googlecode.com. Software and ontologies to support transformation of data from FlyBase, FlyAtlas, BDGP and FlyTED to RDF are available at http://openflydata.googlecode.com. SPARQLite, an implementation of the SPARQL protocol, is available at http://sparqlite.googlecode.com. All software is provided under the GPL version 3 open source license.
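
For readers unfamiliar with the SPARQL protocol, the following sketch shows how a client might prepare a query request over HTTP GET, with the query passed in the URL-encoded `query` parameter. The endpoint URL and the query text are placeholders: the abstract does not give endpoint paths, and the services listed above may no longer be online. The helper name `build_sparql_request` is likewise an assumption for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute the address of a live SPARQL service.
ENDPOINT = "http://example.org/sparql"

# Generic query with an explicit LIMIT to keep the result set small.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL protocol query request.

    The query travels in the `query` URL parameter, and the Accept header
    asks for results in the SPARQL JSON results format.
    """
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    request = build_sparql_request(ENDPOINT, QUERY)
    print(request.get_method(), request.full_url)
    # To execute against a live endpoint, pass `request` to
    # urllib.request.urlopen(request, timeout=10) and parse the JSON body.
```

Keeping a `LIMIT` in client queries and a timeout on the call is a small courtesy to shared endpoints, in line with the reliability concerns raised in the Results section.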
