Jusoh Ainin Sofia, Remli Muhammad Akmal, Mohamad Mohd Saberi, Cazenave Tristan, Fong Chin Siok
Institute for Artificial Intelligence and Big Data, Universiti Malaysia Kelantan, Kota Bharu, 16100, Kelantan, Malaysia; Faculty of Data Science and Computing, Universiti Malaysia Kelantan, Kota Bharu, 16100, Kelantan, Malaysia.
Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Al Ain, 15551, United Arab Emirates; Faculty of Engineering and Technology, Multimedia University, 75450, Melaka, Malaysia; Department of Biosystems Engineering, Faculty of Agricultural Technology, Universitas Brawijaya, 65145, Malang, East Java, Indonesia.
Eur J Med Chem. 2025 Oct 5;295:117825. doi: 10.1016/j.ejmech.2025.117825. Epub 2025 May 27.
Generative Artificial Intelligence (Generative AI) is transforming drug discovery by enabling advanced analysis of complex biological and chemical data. This review explores key Generative AI models, including Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), flow-based models, and Transformer-based models, with Transformers gaining prominence due to the abundance of text-based biological data and the success of language models such as ChatGPT. The paper discusses molecular representations, performance evaluation metrics, and current trends in Generative AI-driven drug discovery, such as protein-protein interactions (PPIs), drug-target interactions (DTIs), and de novo drug design. However, these approaches face significant challenges, including applicability domain issues, lack of interpretability, data scarcity, limited novelty, poor scalability, computational resource limitations, and the absence of standardized evaluation metrics. These challenges hinder model performance, complicate decision-making, and limit the generation of novel and viable drug candidates. To address these issues, the review proposes strategies such as hybrid models, integration of multiomics datasets, explainable AI (XAI) techniques, data augmentation, transfer learning, and cloud-based solutions. It also provides a curated list of databases supporting drug discovery research. The review concludes by emphasizing the need for optimized AI models, robust validation methods, interdisciplinary collaboration, and continued academic effort to fully realize the potential of Generative AI in advancing drug discovery.