InfoChem GmbH, Landsberger Strasse 408/V, D-81241, Munich, Bavaria, Germany.
J Chem Inf Model. 2013 Nov 25;53(11):2884-95. doi: 10.1021/ci400442f. Epub 2013 Oct 25.
Reaction classification has important applications, and many classification approaches have been applied. Our own algorithm tests all maximum common substructures (MCS) between all reactant and product molecules in order to find the atom mapping with the minimum chemical distance (MCD). Recent publications have concluded that new MCS algorithms need to be compared with existing methods in a reproducible environment, preferably on a generalized test set. However, few test sets are available, and they are not truly representative of the range of reactions found in real reaction databases. We have designed a challenging test set of reactions and are making it publicly available for use with InfoChem's software or other classification algorithms. We supply a representative set of example reactions, grouped into levels of difficulty and drawn from a large number of reaction databases that chemists encounter in practice, in order to demonstrate the basic requirements a mapping algorithm must meet to detect reaction centers consistently. We invite the scientific community to contribute to the future extension and improvement of this data set, with the goal of establishing a common standard.
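To make the idea of a minimum-chemical-distance mapping concrete, the following is a minimal sketch, not the algorithm described in the abstract. It assumes a simplified chemical distance (the sum of absolute bond-order changes over all atom pairs) and replaces the MCS-based search with brute-force enumeration of element-preserving atom permutations, which is only feasible for toy inputs. The function names `chemical_distance` and `best_mapping`, the dictionary-based molecule representation, and the example reaction are all hypothetical and chosen for illustration. The reaction is assumed to be balanced (same atoms on both sides).

```python
from itertools import permutations


def chemical_distance(reactant_bonds, product_bonds, mapping):
    """Simplified chemical distance: sum of absolute bond-order changes.

    reactant_bonds, product_bonds: {frozenset({atom_a, atom_b}): bond_order}
    mapping: {reactant_atom_id: product_atom_id}, assumed to be a bijection
    """
    # Re-express the reactant bonds in the product's atom numbering.
    mapped = {
        frozenset(mapping[a] for a in pair): order
        for pair, order in reactant_bonds.items()
    }
    # A bond missing on one side counts as bond order 0 on that side.
    pairs = set(mapped) | set(product_bonds)
    return sum(abs(mapped.get(p, 0) - product_bonds.get(p, 0)) for p in pairs)


def best_mapping(reactant_elems, product_elems, reactant_bonds, product_bonds):
    """Brute-force search for the element-preserving mapping with minimum distance.

    reactant_elems, product_elems: {atom_id: element_symbol}
    Returns (distance, mapping). Assumes a balanced reaction.
    """
    r_ids = sorted(reactant_elems)
    p_ids = sorted(product_elems)
    best = None
    for perm in permutations(p_ids):
        # Skip candidate mappings that would change an atom's element.
        if any(reactant_elems[r] != product_elems[p] for r, p in zip(r_ids, perm)):
            continue
        mapping = dict(zip(r_ids, perm))
        d = chemical_distance(reactant_bonds, product_bonds, mapping)
        if best is None or d < best[0]:
            best = (d, mapping)
    return best


# Toy example (hypothetical): a C(=O)O fragment whose product numbering
# swaps the two oxygen atoms. Heavy atoms only.
reactant_elems = {1: "C", 2: "O", 3: "O"}
product_elems = {1: "C", 2: "O", 3: "O"}
reactant_bonds = {frozenset({1, 2}): 2, frozenset({1, 3}): 1}
product_bonds = {frozenset({1, 2}): 1, frozenset({1, 3}): 2}

distance, mapping = best_mapping(
    reactant_elems, product_elems, reactant_bonds, product_bonds
)
print(distance, mapping)  # 0 {1: 1, 2: 3, 3: 2}
```

In this toy case the identity mapping implies two bond-order changes (distance 2), whereas swapping the two oxygen atoms explains the product with no bond changes at all (distance 0). A mapping algorithm that does not minimize the distance would therefore report a spurious reaction center, which is the kind of inconsistency a shared test set is meant to expose.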