Viira Birgit, Gendron Thibault, Lanfranchi Don Antoine, Cojean Sandrine, Horvath Dragos, Marcou Gilles, Varnek Alexandre, Maes Louis, Maran Uko, Loiseau Philippe M, Davioud-Charvet Elisabeth
Institute of Chemistry, University of Tartu, 50411 Tartu, Estonia.
Bioorganic and Medicinal Chemistry Team, UMR 7509 CNRS-Université de Strasbourg, European School of Chemistry, Polymers and Materials (ECPM), 25, rue Becquerel, Strasbourg F-67087, France.
Molecules. 2016 Jun 29;21(7):853. doi: 10.3390/molecules21070853.
Malaria is a parasitic tropical disease that kills around 600,000 people every year. The emergence of Plasmodium falciparum parasites resistant to artemisinin-based combination therapies (ACTs) represents a significant public health threat and indicates an urgent need for new, effective compounds that reverse ACT resistance and cure the disease. To this end, experimental anti-Plasmodium screening data from both in-house and ChEMBL sources were extensively curated and homogenized. This yielded a coherent strategy for compiling training sets that associate compound structures with their respective antimalarial activity measurements. Seventeen of these training sets led to the successful generation of classification models that estimate whether a compound has a significant probability of being active under the specific conditions of the antimalarial test associated with each set. These models were used in consensus to predict the most likely actives among a series of curcuminoids available in-house. The compounds predicted as active, together with a few predicted as inactive, were then submitted to experimental in vitro antimalarial testing. A large majority of the compounds predicted as active showed antimalarial activity, whereas those predicted as inactive did not, thus experimentally validating the in silico screening approach. The consensus machine learning approach proposed here showed its potential to reduce the cost and duration of antimalarial drug discovery.
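The consensus step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the individual model outputs were combined, so the voting rule below (fraction of models that call a compound active) is an assumption, as are the names `consensus_score` and `rank_by_consensus`, the 0.5 threshold, and the toy models and compound labels, which stand in for trained classifiers and real descriptors.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in for one trained classifier: it maps a compound's
# descriptor vector to a probability of being active in one assay.
Model = Callable[[Sequence[float]], float]


def consensus_score(models: List[Model],
                    features: Sequence[float],
                    threshold: float = 0.5) -> float:
    """Return the fraction of models that predict the compound as active.

    Assumed rule: a model votes "active" when its probability is at or
    above `threshold`. This is an illustrative choice, not the paper's.
    """
    votes = [1 if model(features) >= threshold else 0 for model in models]
    return sum(votes) / len(votes)


def rank_by_consensus(models: List[Model],
                      compounds: Dict[str, Sequence[float]],
                      threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Sort compounds from highest to lowest consensus score."""
    scored = [(name, consensus_score(models, feats, threshold))
              for name, feats in compounds.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: three made-up "models" and three made-up compounds.
models: List[Model] = [
    lambda x: x[0],
    lambda x: x[1],
    lambda x: (x[0] + x[1]) / 2,
]
compounds = {
    "cpd_A": [0.9, 0.8],
    "cpd_B": [0.2, 0.6],
    "cpd_C": [0.1, 0.3],
}

ranking = rank_by_consensus(models, compounds)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In a workflow like the one described, the top of such a ranking would be sent to in vitro testing along with a few low-scoring compounds as negative controls.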