Bajorath Jürgen
Department of Life Science Informatics, B-IT, LIMES Program Unit, Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113, Bonn, Germany.
Methods Mol Biol. 2017;1526:247-256. doi: 10.1007/978-1-4939-6613-4_14.
In recent years, there has been unprecedented growth in compound activity data in the public domain. These compound data provide an indispensable resource for drug discovery in academic environments as well as in the pharmaceutical industry. To handle large volumes of heterogeneous and complex compound data and extract discovery-relevant knowledge from these data, advanced computational mining approaches are required. Herein, major public compound data repositories are introduced, data confidence criteria reviewed, and selected data mining approaches discussed.