Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland.
J Chem Inf Model. 2013 Aug 26;53(8):1979-89. doi: 10.1021/ci400206h. Epub 2013 Jul 30.
SMIfp (SMILES fingerprint) is defined here as a scalar fingerprint that describes organic molecules by counting the occurrences of 34 different symbols in their SMILES strings, which creates a 34-dimensional chemical space. Ligand-based virtual screening using the city-block distance CBD(SMIfp) as the similarity measure provides good AUC values and enrichment factors for recovering series of actives from the Directory of Useful Decoys (DUD-E) and from ZINC. DrugBank, ChEMBL, ZINC, PubChem, GDB-11, GDB-13, and GDB-17 can be searched by CBD(SMIfp) using an online SMIfp-browser at www.gdb.unibe.ch. The SMIfp chemical space was visualized by principal component analysis and color-coded maps of the (PC1, PC2)-planes, with interactive access to the molecules through the Java application SMIfp-MAPPLET, available from www.gdb.unibe.ch. These maps spread molecules according to their fraction of aromatic atoms, size, and polarity. SMIfp provides a new and relevant entry point for exploring small-molecule chemical space.
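The counting and distance steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the 34 symbols, so the `SYMBOLS` list below is a small hypothetical subset chosen for illustration, and the function names `symbol_counts`, `city_block_distance`, and `rank_by_distance` are likewise assumptions. Two-character symbols such as `Cl` are matched before single characters so that they are not miscounted as `C`.

```python
import re
from collections import Counter

# Hypothetical symbol subset for illustration only; the actual 34 symbols
# used by SMIfp are not given in the abstract.
SYMBOLS = ["Cl", "Br", "C", "c", "N", "n", "O", "o", "S", "s",
           "F", "=", "#", "(", "[", "+", "-"]

# Longest symbols first, so "Cl" is matched as one token rather than "C".
_TOKEN_RE = re.compile(
    "|".join(re.escape(s) for s in sorted(SYMBOLS, key=len, reverse=True))
)


def symbol_counts(smiles):
    """Return a count vector with one entry per symbol in SYMBOLS."""
    counts = Counter(_TOKEN_RE.findall(smiles))
    return [counts[s] for s in SYMBOLS]


def city_block_distance(a, b):
    """City-block (Manhattan) distance between two count vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_distance(query, library):
    """Sort library SMILES by increasing distance to the query SMILES."""
    q = symbol_counts(query)
    scored = [(city_block_distance(q, symbol_counts(s)), s) for s in library]
    return sorted(scored)


if __name__ == "__main__":
    # Benzene as query; phenol should rank closer than ethanol.
    for dist, smi in rank_by_distance("c1ccccc1", ["CCO", "Oc1ccccc1"]):
        print(dist, smi)
```

In this sketch, characters outside the symbol list (for example ring-closure digits) are simply ignored; whether the published fingerprint counts them is not stated in the abstract. A smaller distance means a more similar symbol composition, which is the sense in which CBD(SMIfp) is used as a similarity measure for virtual screening.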