Data management in structural genomics: an overview.

Authors

Haquin Sabrina, Oeuillet Eric, Pajon Anne, Harris Mark, Jones Alwyn T, van Tilbeurgh Herman, Markley John L, Zolnai Zolt, Poupon Anne

Affiliation

Yeast Structural Genomics, IBBMC, Université Paris-Sud, Orsay, France.

Publication information

Methods Mol Biol. 2008;426:49-79. doi: 10.1007/978-1-60327-058-8_4.

Abstract

Data management has been identified as a crucial issue in all large-scale experimental projects. In such projects, many people manipulate multiple objects in different locations; unless complete and accurate records are maintained, it is extremely difficult to establish exactly what was done, when it was done, who did it, and which protocol was used. All of this information is essential for preparing publications, reusing successful protocols, determining why a target has failed, and validating and optimizing protocols. Although data management solutions have been in place for certain focused activities (e.g., genome sequencing and microarray experiments), they are only now emerging for broader projects such as structural genomics, metabolomics, and systems biology as a whole. The complexity of experimental procedures, together with the diversity and rapid development of the protocols used within a single center or across centers, has important consequences for the design of information management systems. Because procedures are carried out both by machines and by hand, the system must be capable of handling data entry from robotic systems as well as through a user-friendly interface. The information management system must also be flexible enough to handle changes to existing protocols and newly added protocols. Because no commercial information management system has offered the needed features, most structural genomics groups have developed their own solutions. This chapter discusses the advantages of using a laboratory information management system (LIMS) for the day-to-day management of structural genomics projects and for data mining. It reviews solutions currently in place or under development, with emphasis on three systems developed by the authors: Xtrack, Sesame (developed at the Center for Eukaryotic Structural Genomics under the US Protein Structural Genomics Initiative), and HalX (developed at the Yeast Structural Genomics Laboratory, in collaboration with the European SPINE project).

