GenoLIB: a database of biological parts derived from a library of common plasmid features.

Authors

Adames Neil R, Wilson Mandy L, Fang Gang, Lux Matthew W, Glick Benjamin S, Peccoud Jean

Affiliations

Virginia Bioinformatics Institute, Virginia Tech, 1015 Life Science Circle, Blacksburg, VA 24061, USA.

Virginia Bioinformatics Institute, Virginia Tech, 1015 Life Science Circle, Blacksburg, VA 24061, USA; School of Biological Technology, Xi'an University of Arts and Science, Xi'an, Shaanxi Province 710065, China.

Publication information

Nucleic Acids Res. 2015 May 26;43(10):4823-32. doi: 10.1093/nar/gkv272. Epub 2015 Apr 29.

Abstract

Synthetic biologists rely on databases of biological parts to design genetic devices and systems. The sequences and descriptions of genetic parts are often derived from features of previously described plasmids using ad hoc, error-prone and time-consuming curation processes because existing databases of plasmids and features are loosely organized. These databases often lack consistency in the way they identify and describe sequences. Furthermore, legacy bioinformatics file formats like GenBank do not provide enough information about the purpose of features. We have analyzed the annotations of a library of ∼2000 widely used plasmids to build a non-redundant database of plasmid features. We looked at the variability of plasmid features, their usage statistics and their distributions by feature type. We segmented the plasmid features by expression hosts. We derived a library of biological parts from the database of plasmid features. The library was formatted using the Synthetic Biology Open Language, an emerging standard developed to better organize libraries of genetic parts to facilitate synthetic biology workflows. As a proof of concept, the library was converted into GenoCAD grammar files to allow users to import and customize the library based on the needs of their research projects.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab72/4446419/cad54a6d3287/gkv272fig1.jpg
