Oregon Health & Science University, Library, LIB, 3181 S.W. Sam Jackson Park Rd., Portland, OR 97239-3098, USA.
Database (Oxford). 2012 Mar 20;2012:bar067. doi: 10.1093/database/bar067. Print 2012.
Development of biocuration processes and guidelines for new data types or projects is a challenging task. Each project finds its way toward defining annotation standards and ensuring data consistency with varying degrees of planning and different tools to support and/or report on consistency. Further, this process may be data type specific even within the context of a single project. This article describes our experiences with eagle-i, a 2-year pilot project to develop a federated network of data repositories in which unpublished, unshared or otherwise 'invisible' scientific resources could be inventoried and made accessible to the scientific community. During the course of eagle-i development, the main challenges we experienced related to the difficulty of collecting and curating data while the system and the data model were simultaneously being built, and to a deficiency and diversity of data management strategies in the laboratories from which the source data was obtained. We discuss our approach to biocuration and the importance of improving information management strategies to the research process, specifically with regard to the inventorying and usage of research resources. Finally, we highlight the commonalities and differences between eagle-i and similar efforts with the hope that our lessons learned will assist other biocuration endeavors.
Database URL: www.eagle-i.net