Data capture in bioinformatics: requirements and experiences with Pedro.

Authors

Jameson Daniel, Garwood Kevin, Garwood Chris, Booth Tim, Alper Pinar, Oliver Stephen G, Paton Norman W

Affiliation

School of Chemistry, Manchester Interdisciplinary Biocentre, The University of Manchester, 131 Princess Street, Manchester, M1 7DN, UK.

Publication information

BMC Bioinformatics. 2008 Apr 10;9:183. doi: 10.1186/1471-2105-9-183.

DOI:10.1186/1471-2105-9-183

PMID:18402673

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2335277/

Abstract

BACKGROUND

The systematic capture of appropriately annotated experimental data is a prerequisite for most bioinformatics analyses. Data capture is required not only for submission of data to public repositories, but also to underpin integrated analysis, archiving, and sharing - both within laboratories and in collaborative projects. The widespread requirement to capture data means that data capture and annotation are taking place at many sites, but the small scale of the literature on tools, techniques and experiences suggests that there is work to be done to identify good practice and reduce duplication of effort.

RESULTS

This paper reports on experience gained in the deployment of the Pedro data capture tool in a range of representative bioinformatics applications. The paper makes explicit the requirements that have recurred when capturing data in different contexts, indicates how these requirements are addressed in Pedro, and describes case studies that illustrate where the requirements have arisen in practice.

CONCLUSION

Data capture is a fundamental activity for bioinformatics; all biological data resources build on some form of data capture activity, and many require a blend of import, analysis and annotation. Recurring requirements in data capture suggest that model-driven architectures can be used to construct data capture infrastructures that can be rapidly configured to meet the needs of individual use cases. We have described how one such model-driven infrastructure, namely Pedro, has been deployed in representative case studies, and discussed the extent to which the model-driven approach has been effective in practice.
