SMITH: a LIMS for handling next-generation sequencing workflows.

Publication information

BMC Bioinformatics. 2014;15 Suppl 14(Suppl 14):S3. doi: 10.1186/1471-2105-15-S14-S3. Epub 2014 Nov 27.

Abstract

BACKGROUND

Life-science laboratories make increasing use of Next Generation Sequencing (NGS) for studying bio-macromolecules and their interactions. Array-based methods for measuring gene expression or protein-DNA interactions are being replaced by RNA-Seq and ChIP-Seq. Sequencing is generally performed by specialized facilities that have to keep track of sequencing requests, trace samples, ensure quality, and make data available according to predefined privileges. An integrated tool helps to troubleshoot problems, maintain a high quality standard, and reduce time and costs. Commercial and non-commercial tools called LIMS (Laboratory Information Management Systems) are available for this purpose. However, they often come at prohibitive cost and/or lack the flexibility and scalability needed to adjust seamlessly to the frequently changing protocols employed. To manage the flow of sequencing data produced at the Genomic Unit of the Italian Institute of Technology (IIT), we developed SMITH (Sequencing Machine Information Tracking and Handling).

METHODS

SMITH is a web application with a MySQL server at the backend. It was developed by wet-lab scientists of the Centre for Genomic Science and database experts from the Politecnico of Milan in the context of a Genomic Data Model Project. The database schema stores all the information of an NGS experiment, including descriptions of all protocols and algorithms used in the process. Notably, an attribute-value table allows an unconstrained textual description to be associated with each sample and with all the data produced afterwards. This method permits the creation of metadata that can be used to search the database for specific files as well as for statistical analyses.
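
The attribute-value idea described above can be illustrated with a minimal sketch. It is not SMITH's actual schema: the table names (`sample`, `sample_attribute`), the column names, and the helper functions are assumptions made for illustration, and Python's built-in `sqlite3` stands in for the MySQL backend so that the example is self-contained.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical attribute-value layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sample (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        -- One row per (sample, key, value): keys and values are free text,
        -- so new kinds of annotation need no schema change.
        CREATE TABLE sample_attribute (
            sample_id  INTEGER NOT NULL REFERENCES sample(id),
            attr_key   TEXT NOT NULL,
            attr_value TEXT NOT NULL
        );
        """
    )
    return conn


def annotate(conn, sample_id, **attributes):
    """Attach arbitrary key-value annotations to a sample."""
    conn.executemany(
        "INSERT INTO sample_attribute (sample_id, attr_key, attr_value) "
        "VALUES (?, ?, ?)",
        [(sample_id, key, str(value)) for key, value in attributes.items()],
    )


def find_samples(conn, key, value):
    """Return the names of samples carrying a given key-value pair."""
    rows = conn.execute(
        "SELECT s.name FROM sample AS s "
        "JOIN sample_attribute AS a ON a.sample_id = s.id "
        "WHERE a.attr_key = ? AND a.attr_value = ? "
        "ORDER BY s.name",
        (key, value),
    )
    return [row[0] for row in rows]


conn = build_demo_db()
conn.executemany(
    "INSERT INTO sample (id, name) VALUES (?, ?)",
    [(1, "S1"), (2, "S2")],
)
annotate(conn, 1, application="ChIP-Seq", antibody="H3K4me3")
annotate(conn, 2, application="RNA-Seq")
print(find_samples(conn, "application", "ChIP-Seq"))
```

Because every annotation is just a row, the same query shape serves both file lookup and simple aggregate statistics (for example, counting samples per `application` value).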

RESULTS

SMITH runs automatically and limits direct human interaction mainly to administrative tasks. Its data-delivery procedures were standardized, making it easier for biologists and analysts to navigate the data. Automation also helps save time. The workflows are available through an API provided by the workflow management system. Parameters and input data are passed to the workflow engine, which performs de-multiplexing, quality control, alignments, etc.
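
As a rough illustration of passing parameters and input data to a workflow engine, the sketch below assembles a request and serializes it to JSON. The abstract does not name the workflow management system or describe its API, so everything here is hypothetical: the `WorkflowRequest` class, the `build_request` and `to_payload` helpers, the workflow name, and the field names. Only the step list (de-multiplexing, quality control, alignment) comes from the text.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class WorkflowRequest:
    """Hypothetical description of one workflow invocation."""
    workflow: str
    inputs: list
    parameters: dict = field(default_factory=dict)


def build_request(run_folder, sample_sheet, reference):
    """Bundle input data and parameters for a sequencing run (illustrative names)."""
    return WorkflowRequest(
        workflow="demultiplex_qc_align",  # assumed workflow identifier
        inputs=[run_folder, sample_sheet],
        parameters={
            "reference": reference,
            # Steps named in the abstract, in processing order.
            "steps": ["demultiplex", "quality_control", "align"],
        },
    )


def to_payload(request):
    """Serialize the request to JSON, as a client might before calling an API."""
    return json.dumps(asdict(request), sort_keys=True)


payload = to_payload(build_request("run_001", "sample_sheet.csv", "hg19"))
print(payload)
```

Keeping the request as plain data separates what the LIMS knows (inputs, parameters) from how the engine executes the steps, which is one way the automation described above could be kept independent of a particular engine.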

CONCLUSIONS

SMITH standardizes, automates, and speeds up sequencing workflows. Annotation of data with key-value pairs facilitates meta-analysis.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1e58/4255740/2a93056aba1e/1471-2105-15-S14-S3-1.jpg
