Spfy: an integrated graph database for real-time prediction of bacterial phenotypes and downstream comparative analyses.

Affiliation

National Microbiology Laboratory at Lethbridge, Public Health Agency of Canada, Lethbridge, Canada.

Publication information

Database (Oxford). 2018 Jan 1;2018:1-10. doi: 10.1093/database/bay086.

Abstract

Public health laboratories are currently moving to whole-genome sequence (WGS)-based analyses, and require the rapid prediction of standard reference laboratory methods based solely on genomic data. Currently, these predictive genomics tasks rely on workflows that chain together multiple programs for the requisite analyses. While useful, these systems do not store the analyses in a genome-centric way, meaning the same analyses are often re-computed for the same genomes. To solve this problem, we created Spfy, a platform that rapidly performs the common reference laboratory tests, uses a graph database to store and retrieve the results from the computational workflows and links data to individual genomes using standardized ontologies. The Spfy platform facilitates rapid phenotype identification, as well as the efficient storage and downstream comparative analysis of tens of thousands of genome sequences. Though generally applicable to bacterial genome sequences, Spfy currently contains 10 243 Escherichia coli genomes, for which in-silico serotype and Shiga-toxin subtype, as well as the presence of known virulence factors and antimicrobial resistance determinants have been computed. Additionally, the presence/absence of the entire E. coli pan-genome was computed and linked to each genome. Owing to its database of diverse pre-computed results, and the ability to easily incorporate user data, Spfy facilitates hypothesis testing in fields ranging from population genomics to epidemiology, while mitigating the re-computation of analyses. The graph approach of Spfy is flexible, and can accommodate new analysis software modules as they are developed, easily linking new results to those already stored. Spfy provides a database and analyses approach for E. coli that is able to match the rapid accumulation of WGS data in public databases.
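The genome-centric storage idea described in the abstract can be illustrated with a small sketch. This is not Spfy's implementation: the `GenomeGraph` class, the `has_marker` predicate, the `toy_marker_scan` function and its motifs are all hypothetical names invented for illustration, and a plain in-memory dictionary stands in for a real graph database and ontology terms. The sketch only shows the general pattern of keying results to a genome node so that a repeated request for the same genome is answered from storage instead of being recomputed.

```python
import hashlib
from collections import defaultdict


class GenomeGraph:
    """Toy in-memory triple store keyed by a hash of the genome sequence.

    Hypothetical illustration only; a real system would use a graph
    database and standardized ontology terms as predicates.
    """

    def __init__(self):
        # subject -> predicate -> set of objects
        self._triples = defaultdict(lambda: defaultdict(set))
        # (genome id, predicate) pairs whose analysis has already run
        self._done = set()
        # counts how many times an analysis function was actually executed
        self.computations = 0

    @staticmethod
    def genome_id(sequence):
        # Identical sequences hash to the same node, so results are shared.
        return "genome:" + hashlib.sha1(sequence.encode("ascii")).hexdigest()

    def add(self, subject, predicate, obj):
        self._triples[subject][predicate].add(obj)

    def objects(self, subject, predicate):
        # .get() avoids creating empty entries during lookups.
        return set(self._triples.get(subject, {}).get(predicate, set()))

    def get_or_compute(self, sequence, predicate, analysis):
        """Return stored results, running `analysis` only on first request."""
        gid = self.genome_id(sequence)
        if (gid, predicate) not in self._done:
            self.computations += 1
            for result in analysis(sequence):
                self.add(gid, predicate, result)
            self._done.add((gid, predicate))
        return self.objects(gid, predicate)

    def genomes_with(self, predicate, obj):
        """Comparative query: all genome nodes linked to a given result."""
        return {
            subject
            for subject, preds in self._triples.items()
            if obj in preds.get(predicate, ())
        }


def toy_marker_scan(sequence):
    """Hypothetical stand-in for an analysis module (made-up motifs)."""
    markers = {"stx-like": "ATGAAG", "amr-like": "GGCCTT"}
    return [name for name, motif in markers.items() if motif in sequence]


graph = GenomeGraph()
seq = "CCATGAAGTT"
first = graph.get_or_compute(seq, "has_marker", toy_marker_scan)
second = graph.get_or_compute(seq, "has_marker", toy_marker_scan)  # served from storage
```

Because results hang off the genome node instead of living inside a workflow run, a new analysis module would only need to add triples under a new predicate, and a query such as `graph.genomes_with("has_marker", "stx-like")` can then compare genomes across every stored result.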

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48df/6146121/5a8590e10b89/bay086f1.jpg