Cerisier Natacha, Truong Emily, Watanabe Taku, Oshiro Taro, Takahashi Tomohiro, Ito Shigeaki, Taboureau Olivier
Université Paris Cité, INSERM U1133, CNRS UMR 8251, 75013, Paris, France.
Scientific Product Assessment Center, Japan Tobacco Inc, 6-2, Umegaoka, Aoba-Ku, Yokohama, Kanagawa, 227-8512, Japan.
Mutagenesis. 2025 Aug 4. doi: 10.1093/mutage/geaf014.
The mutagenicity of chemical compounds is a key consideration in toxicology, drug development, and environmental safety. Traditional methods such as the Ames test, while reliable, are time-intensive and costly. With advances in imaging and machine learning, high-content assays such as Cell Painting offer new opportunities for predictive toxicology. Cell Painting captures extensive morphological features of cells, which can correlate with chemical bioactivity. In this study, we used Cell Painting data to develop machine learning models for predicting mutagenicity and compared their performance with that of structure-based models. We used two datasets: a Broad Institute dataset containing profiles of over 30,000 molecules and a US EPA dataset with images of 1,200 chemicals tested at multiple concentrations. By integrating these datasets, we aimed to improve the robustness of our models. Of the three algorithms tested (Random Forest, Support Vector Machine, and Extreme Gradient Boosting), Extreme Gradient Boosting performed best on both datasets. Notably, selecting the most relevant concentration per compound, the Phenotypic Altering Concentration, significantly improved prediction accuracy. Our models outperformed traditional quantitative structure-activity relationship (QSAR) tools such as VEGA and the CompTox Dashboard for the majority of compounds, demonstrating the utility of Cell Painting features. The Cell Painting-based models revealed morphological changes related to DNA/RNA and endoplasmic reticulum (ER) perturbation, especially in mitochondria and nuclei, consistent with mechanisms of mutagenicity. Nevertheless, certain compounds remained challenging to predict because of inherent dataset limitations and inter-laboratory variability in Cell Painting technology. The findings highlight the potential of Cell Painting in mutagenicity prediction, offering a complementary perspective to chemical structure-based models.
Future work could involve harmonizing Cell Painting methodologies across datasets and exploring deep learning techniques to enhance predictive accuracy. Ultimately, integrating Cell Painting data with QSAR descriptors in hybrid models may unlock novel insights into chemical mutagenicity.
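The per-compound concentration selection mentioned above can be illustrated with a minimal sketch. The abstract does not state how the Phenotypic Altering Concentration is derived, so the rule used here is purely an assumption for illustration: take the lowest tested concentration whose morphological activity score reaches a threshold, and fall back to the highest tested concentration when none does. The function name, the `activity_score` field, the threshold value, and the example data are all hypothetical.

```python
from collections import defaultdict


def select_concentration(profiles, threshold=1.0):
    """Pick one concentration per compound (hypothetical rule, not the paper's).

    profiles: iterable of (compound_id, concentration, activity_score) tuples.
    Returns a dict mapping compound_id to the selected concentration.
    """
    by_compound = defaultdict(list)
    for compound_id, concentration, score in profiles:
        by_compound[compound_id].append((concentration, score))

    selected = {}
    for compound_id, rows in by_compound.items():
        rows.sort()  # ascending concentration
        # Assumed rule: lowest concentration with score >= threshold.
        active = [conc for conc, score in rows if score >= threshold]
        # Fallback (also an assumption): highest tested concentration.
        selected[compound_id] = active[0] if active else rows[-1][0]
    return selected


# Made-up example data: two compounds, three concentrations each.
profiles = [
    ("cmpd_A", 0.1, 0.2), ("cmpd_A", 1.0, 1.4), ("cmpd_A", 10.0, 3.1),
    ("cmpd_B", 0.1, 0.1), ("cmpd_B", 1.0, 0.3), ("cmpd_B", 10.0, 0.6),
]
print(select_concentration(profiles))
```

In a workflow of this kind, only the profile at the selected concentration would then be passed to the classifier for each compound, so that every compound contributes a single feature vector.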