Wei Jennifer N, Ruiz Carlos, Vlot Marnix, Sanchez-Lengeling Benjamin, Lee Brian K, Berning Luuk, Vos Martijn W, Henderson Rob W M, Qian Wesley W, Sanders Jacob N, Ando D Michael, Groetsch Kurt M, Gerkin Richard C, Wiltschko Alexander B, Riffell Jeffrey A, Dechering Koen J
Google Research, Brain Team, Cambridge, MA, USA.
Department of Biology, University of Washington, Seattle, WA, USA.
Chem Senses. 2025 Jan 22;50. doi: 10.1093/chemse/bjaf021.
Insect-borne diseases kill more than 0.5 million people annually. Currently available repellents for personal or household protection are limited in efficacy, applicability, and safety. Here, we describe a machine-learning-driven, high-throughput method for discovering novel repellent molecules. We first digitized a large historical dataset containing ~19,000 mosquito repellency measurements, then trained a graph neural network (GNN) to map molecular structure to repellency. We applied this model to select 317 candidate molecules for testing in parallelizable behavioral assays, quantifying repellency in multiple insect vectors of disease-causing pathogens and in follow-up trials with human volunteers. The GNN approach outperformed a chemoinformatic model and produced a hit rate that increased with training data size, suggesting that both model innovation and novel data collection were integral to predictive accuracy. We identified more than 10 molecules with repellency similar to or greater than that of the most widely used repellents. We analyzed neural responses in the mosquito antennal (olfactory) lobe to selected repellents and found strong responses to many of the tested compounds, including those predicted to be strong repellents. The antennal lobe recordings also showed a correlation between the evoked responses to strong repellents and our GNN representation. This approach enables computational screening of billions of possible molecules to identify empirically tractable numbers of candidate repellents, accelerating progress towards solving a global health challenge.