Pharmaceutical Research & Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., 4070 Basel, Switzerland.
Chem Res Toxicol. 2023 Sep 18;36(9):1503-1517. doi: 10.1021/acs.chemrestox.3c00137. Epub 2023 Aug 16.
In silico approaches have acquired a towering role in pharmaceutical research and development, allowing laboratories all around the world to design, create, and optimize novel molecular entities with unprecedented efficiency. From a toxicological perspective, computational methods have guided the choices of medicinal chemists toward compounds displaying improved safety profiles. Even if the recent advances in the field are significant, many challenges remain active in the on-target and off-target prediction fields. Machine learning methods have shown their ability to identify molecules with safety concerns. However, they strongly depend on the abundance and diversity of data used for their training. Sharing such information among pharmaceutical companies remains extremely limited due to confidentiality reasons, but in this scenario, a recent concept named "federated learning" can help overcome such concerns. Within this framework, it is possible for companies to contribute to the training of common machine learning algorithms, using, but not sharing, their proprietary data. Very recently, Lhasa Limited organized a hackathon involving several industrial partners in order to assess the performance of their federated learning platform, called "Effiris". In this paper, we share our experience as Roche in participating in such an event, evaluating the performance of the federated algorithms and comparing them with those coming from our in-house-only machine learning models. Our aim is to highlight the advantages of federated learning and its intrinsic limitations and also suggest some points for potential improvements in the method.
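The idea of training a common model on data that never leaves each company can be illustrated with a small sketch. The abstract does not describe the algorithm behind Effiris, so the code below is not that platform's method. It is a generic federated averaging loop on a toy linear model, and every function name, parameter, and the synthetic data are hypothetical. Each partner fits the shared weights on its own private records, and only the updated weights are returned to be averaged.

```python
import random


def local_update(weights, data, lr=0.3, epochs=20):
    """Refine the shared weights on one partner's private data.

    The data stays inside this function; only the weights are returned.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradient of half the mean squared error for the model y = w*x + b.
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates, sizes):
    """Average partner weights, weighted by each partner's dataset size."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return (w, b)


def train_federated(partner_data, rounds=100):
    """Alternate local training and central averaging for a fixed number of rounds."""
    global_weights = (0.0, 0.0)
    sizes = [len(d) for d in partner_data]
    for _ in range(rounds):
        updates = [local_update(global_weights, d) for d in partner_data]
        global_weights = federated_average(updates, sizes)
    return global_weights


def make_partner_data(rng, low, high, n):
    """Synthetic stand-in for proprietary data: y = 2x + 1 plus small noise."""
    xs = [rng.uniform(low, high) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.01)) for x in xs]


rng = random.Random(0)
# Three hypothetical partners, each covering a different region of input space.
partners = [
    make_partner_data(rng, 0.0, 1.0, 40),
    make_partner_data(rng, 1.0, 2.0, 60),
    make_partner_data(rng, -1.0, 0.0, 30),
]
w, b = train_federated(partners)
print(f"shared model: y = {w:.2f} * x + {b:.2f}")
```

In this toy setting each partner sees only part of the input range, which loosely mirrors the abstract's point that model quality depends on the abundance and diversity of the training data.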