Dagan-Wiener Ayana, Nissim Ido, Ben Abu Natalie, Borgonovo Gigliola, Bassoli Angela, Niv Masha Y
Institute of Biochemistry, Food Science and Nutrition, The Robert H. Smith Faculty of Agriculture, Food, and Environment, The Hebrew University of Jerusalem, Rehovot, 76100, Israel.
The Fritz Haber Center for Molecular Dynamics, The Hebrew University of Jerusalem, Jerusalem, 91904, Israel.
Sci Rep. 2017 Sep 21;7(1):12074. doi: 10.1038/s41598-017-12359-7.
Bitter taste is an innately aversive taste modality that is thought to protect animals from consuming toxic compounds. Yet bitterness is not always a sign of harm, and some bitter compounds have beneficial effects on health. Hundreds of bitter compounds have been reported (and are accessible via BitterDB, http://bitterdb.agri.huji.ac.il/dbbitter.php), but many more bitter molecules remain unidentified. The wide chemical diversity of bitterants makes bitterness prediction a difficult task. Here we present a machine-learning classifier, BitterPredict, which predicts from a compound's chemical structure whether it is bitter or not. BitterDB was used as the positive set, and non-bitter molecules were gathered from the literature to create the negative set. The Adaptive Boosting (AdaBoost) machine-learning algorithm, based on decision trees, was applied to molecules represented by physicochemical and ADME/Tox descriptors. BitterPredict correctly classifies over 80% of the compounds in the hold-out test set, and 70-90% of the compounds in three independent external sets and in sensory test validation, providing a quick and reliable tool for classifying large sets of compounds into bitter and non-bitter groups. BitterPredict suggests that about 40% of random molecules, 66% of clinical and experimental drugs, and 77% of natural products are bitter.
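The abstract names AdaBoost over decision trees but gives no implementation details, so the following is only a minimal sketch of the general technique, not the BitterPredict code. It assumes depth-1 trees (decision stumps), labels of +1 for bitter and -1 for non-bitter, and a single made-up descriptor column whose values are synthetic and chosen purely for illustration. All function names are hypothetical.

```python
import math


def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            for sign in (1, -1):
                preds = [sign if row[j] > t else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best


def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] > t else -sign


def adaboost_fit(X, y, n_rounds):
    """Train n_rounds stumps, reweighting the samples after each round."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Misclassified samples gain weight, correctly classified ones lose it.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    """Weighted vote of all stumps: +1 (bitter) or -1 (non-bitter)."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


# Synthetic toy data (NOT from BitterDB): one hypothetical descriptor value
# per molecule, with labels that no single threshold can separate.
X = [[float(v)] for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

model = adaboost_fit(X, y, n_rounds=3)
predictions = [adaboost_predict(model, row) for row in X]
accuracy = sum(p == yi for p, yi in zip(predictions, y)) / len(y)
print("training accuracy:", accuracy)
```

On this toy set the first stump alone misclassifies three of the ten samples, while the weighted vote of three stumps separates all of them, which is the effect boosting relies on. A real application would presumably use many descriptors per molecule and evaluate on held-out data, as the abstract describes.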