Kernels for small molecules and the prediction of mutagenicity, toxicity and anti-cancer activity.

Author information

Swamidass S Joshua, Chen Jonathan, Bruand Jocelyne, Phung Peter, Ralaivola Liva, Baldi Pierre

Affiliation

Institute for Genomics and Bioinformatics, School of Information and Computer Sciences, University of California Irvine, CA, USA.

Publication information

Bioinformatics. 2005 Jun;21 Suppl 1:i359-68. doi: 10.1093/bioinformatics/bti1055.

Abstract

MOTIVATION

Small molecules play a fundamental role in organic chemistry and biology. They can be used to probe biological systems and to discover new drugs and other useful compounds. As increasing numbers of large datasets of small molecules become available, it is necessary to develop computational methods that can deal with molecules of variable size and structure and predict their physical, chemical and biological properties.

RESULTS

Here we develop several new classes of kernels for small molecules using their 1D, 2D and 3D representations. In 1D, we consider string kernels based on SMILES strings. In 2D, we introduce several similarity kernels based on conventional or generalized fingerprints. Generalized fingerprints are derived by counting, in different ways, the subpaths contained in the graph of bonds, which are enumerated using depth-first searches. In 3D, we consider similarity measures between histograms of pairwise distances between atom classes. These kernels can be computed efficiently and are applied to problems of classification and prediction of mutagenicity, toxicity and anti-cancer activity on three publicly available datasets. The results derived using cross-validation methods are state-of-the-art. Tradeoffs between various kernels are briefly discussed.
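
As a rough illustration of the 2D idea, the sketch below counts labeled paths in a small bond graph by depth-first search and compares two molecules with two fingerprint similarity measures. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not name the similarity measures, so the Tanimoto and MinMax forms are assumed here as common choices for fingerprints; paths are restricted to those that never revisit an atom; no hashing into fixed-length bit vectors is done; and the names `enumerate_paths`, `tanimoto_kernel` and `minmax_kernel`, along with the toy molecule encodings, are hypothetical.

```python
from collections import Counter


def enumerate_paths(atoms, bonds, max_depth):
    """Count labeled paths of up to max_depth bonds, found by depth-first search.

    atoms: list of atom labels, e.g. ["C", "C", "O"]
    bonds: dict mapping frozenset({i, j}) to a bond label, e.g. "-" or "="
    Assumption: a path may not visit the same atom twice.
    """
    # Build adjacency lists from the bond dictionary.
    neighbors = {i: [] for i in range(len(atoms))}
    for pair, label in bonds.items():
        i, j = tuple(pair)
        neighbors[i].append((j, label))
        neighbors[j].append((i, label))

    counts = Counter()

    def dfs(atom, visited, path, depth):
        counts[path] += 1  # every prefix of the walk is itself a path
        if depth == max_depth:
            return
        for nxt, label in neighbors[atom]:
            if nxt not in visited:
                dfs(nxt, visited | {nxt}, path + label + atoms[nxt], depth + 1)

    # Start one depth-first search from every atom.
    for start in range(len(atoms)):
        dfs(start, {start}, atoms[start], 0)
    return counts


def tanimoto_kernel(a, b):
    """Tanimoto similarity on the sets of paths present (binary fingerprints)."""
    keys_a, keys_b = set(a), set(b)
    union = len(keys_a | keys_b)
    return len(keys_a & keys_b) / union if union else 0.0


def minmax_kernel(a, b):
    """MinMax similarity on path counts (count-based fingerprints)."""
    keys = set(a) | set(b)
    num = sum(min(a[k], b[k]) for k in keys)
    den = sum(max(a[k], b[k]) for k in keys)
    return num / den if den else 0.0


# Toy heavy-atom graphs (hydrogens omitted): ethanol C-C-O and dimethyl ether C-O-C.
ethanol = enumerate_paths(
    ["C", "C", "O"], {frozenset({0, 1}): "-", frozenset({1, 2}): "-"}, max_depth=2
)
ether = enumerate_paths(
    ["C", "O", "C"], {frozenset({0, 1}): "-", frozenset({1, 2}): "-"}, max_depth=2
)

print(tanimoto_kernel(ethanol, ether))
print(minmax_kernel(ethanol, ether))
```

The two measures differ in what they see: the Tanimoto form only records whether a path occurs, while the MinMax form also uses how often it occurs, which is one way of "counting in different ways". Either value could be passed to a kernel method such as a support vector machine as a precomputed similarity.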

AVAILABILITY

Datasets available from http://www.igb.uci.edu/servers/servers.html
