

CleanEST: a database of cleansed EST libraries.

Authors

Lee Byungwook, Shin Gwangsik

Affiliation

Korean BioInformation Center, KRIBB, Daejeon 305-817, Korea.

Publication information

Nucleic Acids Res. 2009 Jan;37(Database issue):D686-9. doi: 10.1093/nar/gkn648. Epub 2008 Oct 2.

Abstract

The EST division of GenBank, dbEST, is widely used in applications such as gene discovery and verification of exon-intron structure. However, the use of EST sequences in the dbEST libraries is often hampered by inconsistent terminology used to describe the library sources and by the presence of contaminated sequences. Here, we describe CleanEST, a novel database server that classifies dbEST libraries and removes contaminants. We classified all dbEST libraries according to species and sequencing center. In addition, we further classified human EST libraries by anatomical and pathological systems according to eVOC ontologies. For each dbEST library, we provide two different sets of cleansed sequences: 'pre-cleansed' and 'user-cleansed'. To generate pre-cleansed sequences, we cleansed sequences in dbEST by aligning EST sequences against well-known contamination sources: UniVec, Escherichia coli, mitochondria and chloroplast (for plants). To provide user-cleansed sequences, we built an automatic user-cleansing pipeline, in which sequences of a user-selected library are cleansed on-the-fly according to user-selected options. The server is available at http://cleanest.kobic.re.kr/ and the database is updated monthly.
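
The contaminant screening described in the abstract can be illustrated with a minimal sketch. It is not the CleanEST implementation: the abstract does not name the alignment tool or its cutoffs, so the sketch assumes that each contamination source has already been searched separately and that the hits are available as BLAST-style tabular lines (the default 12-column `-outfmt 6` layout). The function names `flag_contaminated` and `cleanse`, the identity and length thresholds, and the example records are all hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def flag_contaminated(
    hits_by_source: Dict[str, Iterable[str]],
    selected_sources: Iterable[str],
    min_identity: float = 95.0,  # assumed cutoff, not taken from the paper
    min_length: int = 50,        # assumed cutoff, not taken from the paper
) -> Dict[str, Set[str]]:
    """Map each EST id to the contamination sources it matched.

    Only sources named in `selected_sources` are screened, which mimics
    the idea of user-selected cleansing options.
    """
    flagged: Dict[str, Set[str]] = {}
    for source in selected_sources:
        for line in hits_by_source.get(source, []):
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comment lines
            fields = dict(zip(BLAST_COLUMNS, line.split("\t")))
            identity = float(fields["pident"])
            aligned = int(fields["length"])
            if identity >= min_identity and aligned >= min_length:
                flagged.setdefault(fields["qseqid"], set()).add(source)
    return flagged


def cleanse(est_ids: Iterable[str], flagged: Dict[str, Set[str]]) -> List[str]:
    """Return the EST ids that were not flagged by any selected source."""
    return [est_id for est_id in est_ids if est_id not in flagged]


# Hypothetical hits: one list of tabular lines per contamination source.
hits = {
    "UniVec": [
        "EST001\tvecA\t99.0\t120\t1\t0\t1\t120\t300\t419\t1e-60\t215",
    ],
    "E. coli": [
        "EST002\tecoli_chr\t88.0\t200\t24\t0\t1\t200\t1000\t1199\t1e-40\t150",
        "EST003\tecoli_chr\t97.5\t80\t2\t0\t10\t89\t5000\t5079\t1e-30\t130",
    ],
}

flagged = flag_contaminated(hits, ["UniVec", "E. coli"])
kept = cleanse(["EST001", "EST002", "EST003", "EST004"], flagged)
print(kept)  # ['EST002', 'EST004']
```

In this sketch an EST is dropped entirely when any selected source produces a qualifying hit. Trimming only the matched region, which is common for vector contamination, would be a reasonable alternative; the abstract does not say which approach CleanEST takes.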


