Polanski Jaroslaw, Kucia Urszula, Duszkiewicz Roksana, Kurczyk Agata, Magdziarz Tomasz, Gasteiger Johann
Institute of Chemistry, University of Silesia, 9 Szkolna Street, 40-006 Katowice, Poland.
Institute of Automatic Control, Silesian University of Technology, 16 Akademicka Street, 44-100 Gliwice, Poland.
Sci Rep. 2016 Jun 23;6:28521. doi: 10.1038/srep28521.
The relationship between the structure and a property of a chemical compound is an essential concept in chemistry, guiding, for example, drug design. However, economic considerations are also needed to fully understand the fate of drugs on the market. Here we explore, for the first time, quantitative structure-economy relationships (QSER) in a large dataset: a commercial building block library of over 2.2 million chemicals. The resulting molecular statistics show that, on average, what we pay for is the quantity of matter. On the other hand, an influence of synthetic availability scores is also revealed. Finally, we buy substances by looking at their molecular graphs or molecular formulas. Thus, molecules with a higher number of atoms look more attractive and are, on average, also more expensive. Our study shows how data binning can be used as an informative method for analyzing big data in chemistry.
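As an illustration of the data binning mentioned in the abstract, the following is a minimal Python sketch that groups compounds into fixed-width bins of a molecular descriptor and averages the price within each bin. The function name `bin_mean_price`, the choice of molecular weight as the descriptor, the bin width, and the toy values are all assumptions made for this example; they are not taken from the study, whose actual descriptors and binning scheme are not given in this record.

```python
from collections import defaultdict
from statistics import mean


def bin_mean_price(records, bin_width):
    """Group (descriptor, price) pairs into fixed-width bins and
    return the mean price per bin, keyed by the bin's lower edge."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    bins = defaultdict(list)
    for descriptor, price in records:
        # Lower edge of the fixed-width bin that contains this value
        lower_edge = (descriptor // bin_width) * bin_width
        bins[lower_edge].append(price)
    # Sort by lower edge so the result reads as a trend across bins
    return {edge: mean(prices) for edge, prices in sorted(bins.items())}


# Hypothetical (molecular weight, price) pairs, invented for illustration only
toy_records = [
    (120.0, 10.0),
    (140.0, 14.0),
    (210.0, 30.0),
    (260.0, 34.0),
    (330.0, 60.0),
]

binned = bin_mean_price(toy_records, bin_width=100.0)
for edge, avg_price in binned.items():
    print(f"[{edge:.0f}, {edge + 100:.0f}): mean price {avg_price:.1f}")
```

Averaging within bins smooths out the scatter of individual prices, which is one plausible way a trend such as "more atoms, higher average price" could be made visible in a library of millions of compounds.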