Design and implementation of a generalized laboratory data model.

Authors

Wendl Michael C, Smith Scott, Pohl Craig S, Dooling David J, Chinwalla Asif T, Crouse Kevin, Hepler Todd, Leong Shin, Carmichael Lynn, Nhan Mike, Oberkfell Benjamin J, Mardis Elaine R, Hillier LaDeana W, Wilson Richard K

Affiliation

Genome Sequencing Center, Washington University, St. Louis, MO 63108, USA.

Publication

BMC Bioinformatics. 2007 Sep 26;8:362. doi: 10.1186/1471-2105-8-362.

DOI:10.1186/1471-2105-8-362
PMID:17897463
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2194795/
Abstract

BACKGROUND

Investigators in the biological sciences continue to exploit laboratory automation methods and have dramatically increased the rates at which they can generate data. In many environments, the methods themselves also evolve in a rapid and fluid manner. These observations point to the importance of robust information management systems in the modern laboratory. Designing and implementing such systems is non-trivial and it appears that in many cases a database project ultimately proves unserviceable.

RESULTS

We describe a general modeling framework for laboratory data and its implementation as an information management system. The model utilizes several abstraction techniques, focusing especially on the concepts of inheritance and meta-data. Traditional approaches commingle event-oriented data with regular entity data in ad hoc ways. Instead, we define distinct regular entity and event schemas, but fully integrate these via a standardized interface. The design allows straightforward definition of a "processing pipeline" as a sequence of events, obviating the need for separate workflow management systems. A layer above the event-oriented schema integrates events into a workflow by defining "processing directives", which act as automated project managers of items in the system. Directives can be added or modified in an almost trivial fashion, i.e., without the need for schema modification or re-certification of applications. Association between regular entities and events is managed via simple "many-to-many" relationships. We describe the programming interface, as well as techniques for handling input/output, process control, and state transitions.
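The separation described above can be illustrated with a minimal in-memory sketch. It is not the authors' schema or programming interface: the names `Event`, `Lab`, `schedule`, `complete`, and `history` are hypothetical, entities are reduced to integer IDs, and the step names are invented. It only shows how events can live apart from entities, be joined to them through a many-to-many link table, and be sequenced by a directive that is ordinary data rather than schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    """One occurrence of a processing step (hypothetical record layout)."""
    event_id: int
    step: str
    status: str = "scheduled"  # becomes "completed" via Lab.complete()


@dataclass
class Lab:
    """Toy in-memory stand-in for an event schema plus a directive layer."""
    # Directives are plain data: pipeline name -> ordered list of step names.
    directives: dict[str, list[str]] = field(default_factory=dict)
    events: dict[int, Event] = field(default_factory=dict)
    # Many-to-many association of entities and events: (entity_id, event_id).
    links: set[tuple[int, int]] = field(default_factory=set)
    next_event_id: int = 1

    def schedule(self, step: str, entity_ids: list[int]) -> Event:
        """Create an event for `step` and link it to every given entity."""
        event = Event(self.next_event_id, step)
        self.next_event_id += 1
        self.events[event.event_id] = event
        for entity_id in entity_ids:
            self.links.add((entity_id, event.event_id))
        return event

    def complete(self, event: Event, pipeline: str) -> Optional[Event]:
        """Mark `event` completed and schedule the step the directive lists next."""
        event.status = "completed"
        steps = self.directives[pipeline]
        position = steps.index(event.step)
        if position + 1 == len(steps):
            return None  # end of the pipeline
        entity_ids = sorted(e for e, ev in self.links if ev == event.event_id)
        return self.schedule(steps[position + 1], entity_ids)

    def history(self, entity_id: int) -> list[tuple[str, str]]:
        """Return (step, status) for each event linked to one entity, oldest first."""
        event_ids = sorted(ev for e, ev in self.links if e == entity_id)
        return [(self.events[i].step, self.events[i].status) for i in event_ids]
```

In this sketch, inserting a new step into `lab.directives["sequencing"]` changes what `complete` schedules next without touching `Event`, `Lab`, or the link table, which mirrors the abstract's claim that directives can change without schema modification.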

CONCLUSION

The implementation described here has served as the Washington University Genome Sequencing Center's primary information system for several years. It handles all transactions underlying a throughput rate of about 9 million sequencing reactions of various kinds per month and has handily weathered a number of major pipeline reconfigurations. The basic data model can be readily adapted to other high-volume processing environments.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b73b/2194795/92560835ba6d/1471-2105-8-362-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b73b/2194795/f4d6238055be/1471-2105-8-362-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b73b/2194795/752942769d01/1471-2105-8-362-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b73b/2194795/2703176b6a04/1471-2105-8-362-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b73b/2194795/d802e5f5a063/1471-2105-8-362-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b73b/2194795/36a8fcf36f24/1471-2105-8-362-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b73b/2194795/29490b586556/1471-2105-8-362-7.jpg
