Data integration for dynamic and sustainable systems biology resources: challenges and lessons learned.

Author information

CyberInfrastructure Section, Virginia Bioinformatics Institute, Washington Street, MC 0477, Virginia Tech, Blacksburg, Virginia 24061, USA.

Publication information

Chem Biodivers. 2010 May;7(5):1124-41. doi: 10.1002/cbdv.200900317.

Abstract

Systems-biology and infectious-disease (host-pathogen-environment) research and development is becoming increasingly dependent on integrating data from diverse and dynamic sources. Maintaining integrated resources over long periods of time presents distinct challenges. This review describes experiences and lessons learned from integrating data in two five-year projects focused on pathosystems biology: the Pathosystems Resource Integration Center (PATRIC, http://patric.vbi.vt.edu/), whose goal is to develop bioinformatics resources based on genomics data for the research and countermeasures-development communities, and the Resource Center for Biodefense Proteomics Research (RCBPR, http://www.proteomicsresource.org/), whose goal is to develop resources based on experimental data, such as microarray and proteomics data, from diverse sources and technologies. The challenges include integrating genomic sequence and experimental data, data synchronization, data quality control, and usability engineering. We present examples of a variety of data-integration problems drawn from our experiences with PATRIC and RCBPR, discuss open research questions related to long-term sustainability, and describe the next steps toward meeting these challenges. Novel contributions of this work include 1) an approach for addressing discrepancies between experiment results and interpreted results, and 2) an expansion of the range of data-integration techniques to include usability engineering at the presentation level.
