Le Goff François, Hazemann Julien, Christen Lukas, Bourquin Geoffroy, Pierlot Gabin, Lange Roland, Panchaud Philippe, Zumbrunn Cornelia, Peter Oliver, Rueedi Georg, Ritz Daniel
Idorsia Pharmaceuticals Ltd, Hegenheimermattweg 91, 4123, Allschwil, Switzerland.
Genedata AG, Margarethenstrasse 38, 4053, Basel, Switzerland.
Sci Rep. 2025 Jul 14;15(1):25431. doi: 10.1038/s41598-025-10208-6.
Assessing intracellular accumulation is a major hurdle in the discovery of new antibiotics with Gram-negative activity. To address this, a high-throughput assay was developed to measure compound uptake and retention in Escherichia coli using LC/MS. A total of 13,056 diverse small molecules were screened against two isogenic E. coli strains: a wild-type and a TolC-deleted mutant. Cell-associated concentrations were determined for 8,410 compounds, and 6,416 compounds were classified as either retention-positive or retention-negative, with 45% (2,885) positive in the TolC mutant. Of these positives, 60% were not retained in the wild-type strain, indicating efficient efflux. No individual structural feature or physicochemical property explained the retention behavior. Machine learning (ML) models were trained on these results, and a gradient-boosted-tree model using 36 physicochemical compound descriptors proved most accurate. The ML model performed robustly on subsets of both similar and dissimilar molecules, indicating strong generalization and real-world predictive potential. In an experimental validation with a set of 540 new compounds, the tool's predictions were correct in 77.8% of retention-positive and 74.4% of retention-negative cases. This assay and prediction tool could enhance Gram-negative antibiotic discovery by aiding screening library design, computational structure-based drug design, and exploration of chemical space before synthesis.
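The counts reported above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the variable names and the `per_class_rates` helper are hypothetical and not part of the published tool, the 60% figure is rounded in the abstract so the derived counts are estimates, and the toy labels at the end are invented purely to show how a per-class rate is computed. Treating the reported 77.8% and 74.4% as the fraction of truly positive (or negative) compounds that were predicted correctly is an assumption; the abstract does not say whether the rates are measured against observed or predicted classes.

```python
# Counts taken directly from the abstract.
screened = 13_056       # compounds screened
quantified = 8_410      # cell-associated concentration determined
classified = 6_416      # labelled retention-positive or retention-negative
positive_tolc = 2_885   # retention-positive in the TolC-deleted mutant

# Fractions along the screening funnel.
frac_quantified = quantified / screened
frac_classified = classified / screened
frac_positive = positive_tolc / classified   # reported as 45%

# "60% were not retained in the wild type" is a rounded value,
# so these two counts are estimates, not reported numbers.
efflux_limited_est = round(0.60 * positive_tolc)
retained_in_both_est = positive_tolc - efflux_limited_est


def per_class_rates(labels, predictions):
    """Return (positive_rate, negative_rate): the fraction of truly
    positive and truly negative items that were predicted correctly.
    Assumption: this is one possible reading of the reported rates."""
    pos = [p for y, p in zip(labels, predictions) if y == 1]
    neg = [p for y, p in zip(labels, predictions) if y == 0]
    positive_rate = sum(1 for p in pos if p == 1) / len(pos)
    negative_rate = sum(1 for p in neg if p == 0) / len(neg)
    return positive_rate, negative_rate


# Toy labels, invented for illustration (1 = retention-positive).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_predictions = [1, 1, 0, 0, 0, 1, 0, 1]
toy_rates = per_class_rates(toy_labels, toy_predictions)

print(f"quantified: {frac_quantified:.1%}, classified: {frac_classified:.1%}")
print(f"positive in TolC mutant: {frac_positive:.1%}")
print(f"estimated efflux-limited positives: {efflux_limited_est}")
print(f"toy per-class rates: {toy_rates}")
```

Under these assumptions, roughly 1,700 of the TolC-mutant positives would be efflux-limited in the wild type, which gives a sense of how much of the retained chemical space is lost to TolC-dependent efflux.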