ChemDB update--full-text search and virtual chemical space.

Authors

Chen Jonathan H, Linstead Erik, Swamidass S Joshua, Wang Dennis, Baldi Pierre

Affiliation

Institute for Genomics and Bioinformatics, School of Information and Computer Sciences, University of California, Irvine, USA.

Publication

Bioinformatics. 2007 Sep 1;23(17):2348-51. doi: 10.1093/bioinformatics/btm341. Epub 2007 Jun 28.

Abstract

ChemDB is a chemical database containing nearly 5M commercially available small molecules, important for use as synthetic building blocks, as probes in systems biology and as leads for the discovery of drugs and other useful compounds. The data are publicly available over the web for download and for targeted searches using a variety of methods. The chemical data include predicted or experimentally determined physicochemical properties, such as 3D structure, melting temperature and solubility. Recent developments include optimization of chemical structure (and substructure) retrieval algorithms, enabling full database searches in less than a second. A text-based search engine allows efficient searching of compounds based on over 65M annotations from over 150 vendors. When searching for chemicals by name, fuzzy text matching draws on both systematic and common names and yields productive results even when the correct spelling of a chemical name is unknown. Finally, built-in reaction models enable searches through virtual chemical space, consisting of hypothetical products readily synthesizable from the building blocks in ChemDB.
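To make the idea of fuzzy name matching concrete, the following is a minimal sketch in Python. It is not ChemDB's implementation, which the abstract does not describe; it uses the standard library's `difflib` similarity ratio as a stand-in technique. The index contents, the `CMPD-…` identifiers and the function name `fuzzy_lookup` are hypothetical and chosen only for illustration.

```python
from difflib import get_close_matches

# Hypothetical name index (an assumption, not ChemDB data):
# lowercase chemical name -> made-up compound identifier.
# Systematic and common names of one compound share an identifier.
NAME_INDEX = {
    "acetaminophen": "CMPD-1",
    "paracetamol": "CMPD-1",
    "acetylsalicylic acid": "CMPD-2",
    "aspirin": "CMPD-2",
    "caffeine": "CMPD-3",
}


def fuzzy_lookup(query, index=NAME_INDEX, limit=3, cutoff=0.6):
    """Return (name, compound_id) pairs whose names resemble the query.

    Results are ordered from most to least similar, and names scoring
    below `cutoff` on difflib's similarity ratio are dropped.
    """
    normalized = query.lower().strip()
    matches = get_close_matches(normalized, list(index), n=limit, cutoff=cutoff)
    return [(name, index[name]) for name in matches]


if __name__ == "__main__":
    # Misspelled queries still resolve to the intended entries.
    print(fuzzy_lookup("acetominophen"))
    print(fuzzy_lookup("Caffiene"))
```

A linear scan like this is only workable for a small index. Searching tens of millions of annotations would need an indexed text search engine, which is a different design from the one sketched here.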

AVAILABILITY

ChemDB and Supplementary Materials are available at http://cdb.ics.uci.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
